semANT at ILIDE conference
In September 2024, semANT was presented at the ILIDE 2024 international conference in Jasná, Slovakia.
semANT presented at 45th Archival Thursday
On 11 April 2024, Petr Žabička presented semANT in his talk at Masaryk University.
Project presentation at LIBER Conference
The semANT project was presented in July 2024 by Filip Jebavý and Petr Žabička at the LIBER 2024 Annual Conference, held at the Cyprus University of Technology (CUT) library in Limassol, Cyprus.
Presentation at FIT BUT – 31.1.
Join us on Friday, January 31, 2025, at the Faculty of Information Technology (Brno University of Technology, room Q201) during the Open Day ("Den otevřených dveří") to explore how Project semANT is revolutionizing access to digitized printed and handwritten texts. We'll demonstrate PERO (handwritten text recognition you can test with your own handwriting), OMR (optical music recognition), and …
Czech Large Language Model CSMPT-7B
In March 2024, we publicly released csmpt7b, the first Czech-only large language model. The model was trained on a dataset collected from the Czech internet, the Internet Archive, and publicly available historical texts ranging from 1850 to the present. The texts were transcribed using our Pero OCR system. Training …
In 2023, semANT brought its first tangible fruit: the TextBite software package. TextBite provides semantic layout analysis on top of plain OCR output. It enhances the PAGE XML description of an analyzed page by introducing title elements, clustering text lines into semantically related parts (chapters, articles, dictionary entries, …), determining the reading order, and altering existing regions as needed. All of this new information is stored in the standard way defined by the PAGE format, allowing for further processing.
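To illustrate what "further processing" of such output might look like, here is a minimal Python sketch that reads text regions from a PAGE XML document in their declared reading order. It is not TextBite code. The sample document, the region ids, the use of the `heading` region type for titles, and the helper name `regions_in_reading_order` are all assumptions made for the example; only the element names (`ReadingOrder`, `OrderedGroup`, `RegionRefIndexed`, `TextRegion`, `TextLine`, `TextEquiv`, `Unicode`) come from the PAGE schema. The sketch also simplifies by sorting all indexed references in one pass, which would not be sufficient for nested reading-order groups.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PAGE XML sample (not real TextBite output).
# The body region appears first in the file but second in the reading order.
SAMPLE_PAGE_XML = """
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="page_0001.jpg" imageWidth="1000" imageHeight="1500">
    <ReadingOrder>
      <OrderedGroup id="ro_1">
        <RegionRefIndexed index="0" regionRef="r_title"/>
        <RegionRefIndexed index="1" regionRef="r_body"/>
      </OrderedGroup>
    </ReadingOrder>
    <TextRegion id="r_body" type="paragraph">
      <Coords points="50,200 950,200 950,400 50,400"/>
      <TextLine id="l_2">
        <Coords points="50,200 950,200 950,300 50,300"/>
        <TextEquiv><Unicode>The town council met on Monday.</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l_3">
        <Coords points="50,300 950,300 950,400 50,400"/>
        <TextEquiv><Unicode>A new bridge was approved.</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
    <TextRegion id="r_title" type="heading">
      <Coords points="50,50 950,50 950,150 50,150"/>
      <TextLine id="l_1">
        <Coords points="50,50 950,50 950,150 50,150"/>
        <TextEquiv><Unicode>Local News</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>
"""


def regions_in_reading_order(xml_text):
    """Return (region id, region type, joined line text) tuples in reading order."""
    root = ET.fromstring(xml_text)

    # "{*}" matches any namespace (Python 3.8+), so the sketch does not
    # depend on one particular PAGE schema version.
    regions = {r.get("id"): r for r in root.iterfind(".//{*}TextRegion")}

    # Simplification: treat all indexed references as one flat ordered group.
    refs = sorted(
        root.iterfind(".//{*}RegionRefIndexed"),
        key=lambda ref: int(ref.get("index")),
    )

    result = []
    for ref in refs:
        region = regions.get(ref.get("regionRef"))
        if region is None:
            continue  # reference to a region that is not a TextRegion
        lines = [
            (u.text or "").strip()
            for u in region.iterfind("{*}TextLine/{*}TextEquiv/{*}Unicode")
        ]
        result.append((region.get("id"), region.get("type"), " ".join(lines)))
    return result


if __name__ == "__main__":
    for region_id, region_type, text in regions_in_reading_order(SAMPLE_PAGE_XML):
        print(f"[{region_type}] {region_id}: {text}")
```

Because the reading order is stored as references to region ids rather than by reordering the regions themselves, a consumer has to resolve those references, as above, instead of relying on document order.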