
IBM's open source document conversion tool ("GitHub Hot Spot")

2024-11-12 08:33:08

Last week's hottest open source projects saw their Star counts take off like rockets. Docling, a multi-format document parsing and export tool, gained 6k Stars in just one week and can be used both as a library and from the command line. Maxun, a new visual web scraping platform, picked up 4k Stars shortly after being open sourced. Actual, a local-first personal finance tool, supports self-hosting with Docker so that users keep their data in their own hands. If you're looking for machine learning Python libraries, check out best-of-ml-python, which covers 34 categories and a total of 920 excellent machine learning Python libraries.

Finally, the free Windows application uninstaller Bulk-Crap-Uninstaller and the Bilibili video skip assistant BilibiliSponsorBlock quickly won over many users with simple, practical features that work out of the box.

  • Table of Contents for this article
    • 1. Popular open source projects
      • 1.1 Multi-format document parsing and export tool: Docling
      • 1.2 Local-first personal finance tool: Actual
      • 1.3 Simple and clean Hugo theme: hugo-PaperMod
      • 1.4 Open source no-code web data extraction platform: Maxun
      • 1.5 List of top machine learning Python libraries: best-of-ml-python
    • 2. HelloGitHub Hot Reviews
      • 2.1 Bilibili video skip assistant: BilibiliSponsorBlock
      • 2.2 Free Windows App Uninstaller: Bulk-Crap-Uninstaller
    • 3. Conclusion

1. Popular open source projects

1.1 Multi-format document parsing and export tool: Docling

Main language: Python, Star: 7.9k, Weekly growth: 6k

This is a Python tool open sourced by IBM that converts all kinds of documents into formats suitable for generative AI. It parses PDF, DOCX, PPTX, images, HTML, Markdown and other popular document formats and exports them to Markdown or JSON. It supports multiple OCR engines for PDFs and provides a unified document object (DoclingDocument), which makes it easy to integrate into retrieval-augmented generation (RAG) and question-answering applications that need documents as input to generative AI models.

from docling.document_converter import DocumentConverter

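source = "url"  # placeholder: local path or URL of the document
converter = DocumentConverter()
# Parse the document; the result wraps a unified DoclingDocument
result = converter.convert(source)
# Export the parsed document as Markdown text
print(result.document.export_to_markdown())  # output: "## Docling Technical Report[...]"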
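To illustrate why Markdown output is convenient for RAG, here is a minimal sketch that splits exported Markdown into heading-level chunks that could then be embedded or indexed. It uses only the Python standard library and is not part of Docling; the function name `split_markdown_by_heading` and the sample text are hypothetical, and the sketch assumes headings are written in `#` style and does not special-case `#` lines inside code blocks.

```python
import re
from typing import List, Tuple


def split_markdown_by_heading(markdown: str) -> List[Tuple[str, str]]:
    """Split Markdown text into (heading, body) chunks at '#'-style headings."""
    chunks: List[Tuple[str, str]] = []
    heading = ""  # text before the first heading gets an empty heading
    body: List[str] = []
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)$", line)
        if match:
            # Close the previous chunk before starting a new one.
            if heading or any(s.strip() for s in body):
                chunks.append((heading, "\n".join(body).strip()))
            heading = match.group(1).strip()
            body = []
        else:
            body.append(line)
    # Flush the last chunk.
    if heading or any(s.strip() for s in body):
        chunks.append((heading, "\n".join(body).strip()))
    return chunks


# Hypothetical sample standing in for the text returned by export_to_markdown()
sample = "## Intro\nDocling converts documents.\n\n## Usage\nCall the converter."
for title, text in split_markdown_by_heading(sample):
    print(f"{title}: {text}")
```

Each chunk keeps its heading, so a retrieval step can show which section of the document an answer came from.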

GitHub address →/DS4SD/docling

1.2 Local-first personal finance tool: Actual

Main language: TypeScript, Star: 15k, Weekly growth: 600

This is a completely free, open source, local-first personal finance tool. It offers a clean interface and intuitive cash flow reports, supports self-hosting with Docker, transaction data import and multi-device sync, and provides optional end-to-end encryption, with a focus on protecting user privacy and data security.

GitHub address →/actualbudget/actual

1.3 Simple and clean Hugo theme: hugo-PaperMod

Main language: HTML, Star: 10k

This is a fast, clean and responsive Hugo theme. It is based on hugo-paper and adds more features and customization options, including multi-language support, automatic switching between light and dark themes, SEO friendliness, social media share buttons, cover images and a navigation bar. It also provides three modes (regular, home-info and profile) for quickly building personal blogs in different styles.

GitHub address →/adityatelange/hugo-PaperMod

1.4 Open source no-code web data extraction platform: Maxun

Main language: TypeScript, Star: 4k, Weekly growth: 3k

This is a new no-code web data extraction platform that lets you scrape website data without programming. It supports list and text capture, screenshots, custom proxies, and automatic handling of pagination and scrolling. As a new open source project, its features are still being iterated on, and planned additions include adapting to changes in site layout and extracting data behind a login.

GitHub address →/getmaxun/maxun

1.5 List of top machine learning Python libraries: best-of-ml-python

Main language: Other, Star: 17k, Weekly growth: 1.2k

The project provides a curated list of high-quality machine learning Python libraries. It includes more than 900 open source projects, ranked by a project quality score and updated weekly. The projects are divided into more than 30 categories, including machine learning frameworks, data visualization, natural language processing, OCR, and model serialization and deployment, so that developers in different application areas can quickly find the machine learning tools and resources they need.

GitHub address →/ml-tooling/best-of-ml-python

2. HelloGitHub Hot Reviews

In this section, we introduce this week's top open source projects on the HelloGitHub website. We hope you not only pick up great open source tools and programming knowledge here, but we are also eager to "hear" from you. You are welcome to share your hands-on experience with and reviews of these open source projects, because honest feedback is what motivates their authors.

2.1 Bilibili video skip assistant: BilibiliSponsorBlock

Main language: TypeScript

This is a browser extension that automatically skips sponsored segments and the opening and ending animations in Bilibili videos. All segment labels are contributed by users, and it supports Chrome, Edge and Firefox.
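The idea of skipping user-labeled segments can be sketched as follows. This is a simplified illustration in Python, not the extension's actual code (which is written in TypeScript); the function name `next_position`, the `Segment` type and the sample timestamps are all hypothetical.

```python
from typing import List, Optional, Tuple

Segment = Tuple[float, float]  # (start_seconds, end_seconds) of a labeled clip


def next_position(current: float, segments: List[Segment]) -> Optional[float]:
    """Return the time to jump to if `current` is inside a labeled segment, else None."""
    target = current
    # Sort by start time so overlapping or back-to-back segments chain into one jump.
    for start, end in sorted(segments):
        if start <= target < end:
            target = end
    return target if target > current else None


# Hypothetical labels: an opening animation, a sponsored clip right after it, and an ending
labeled = [(0.0, 15.0), (15.0, 20.0), (300.0, 330.0)]
print(next_position(3.0, labeled))    # 20.0
print(next_position(100.0, labeled))  # None
```

A player integration would call such a function on each time update and seek to the returned position whenever it is not `None`.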

Project Details→/repository/298fa9ba909c49428c1dc7f8c401bbbd

2.2 Free Windows App Uninstaller: Bulk-Crap-Uninstaller

Main language: C#

This is a Windows software uninstaller developed in C# to quickly remove a large number of unwanted applications. It's completely free, out-of-the-box, and supports batch and force uninstallation, cleaning up residual files, detecting hidden or protected registered applications, and more. Although designed for IT professionals, its simple default settings make it easy for anyone to get started.

Project Details→/repository/e5745984014e47f1a33648c0425256a0

3. Conclusion

That is all for this issue of "GitHub Hot Spot". I hope you found open source projects that interest you here. If you have other fun, interesting GitHub open source projects to share, you are welcome to come to HelloGitHub and share and discuss them with us.

Past Review

  • Open source API learning platform
  • Build Your Own Internet Archive