Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew

AI-generated keywords: DictaLM

AI-generated Key Points

  • DictaLM is a large generative language model with 7 billion parameters, designed for Modern Hebrew
  • Both the foundation model and instruct-tuned model are available under a Creative Commons license
  • Introduction of DictaLM-Rab, a foundation model tailored for Rabbinic/Historical Hebrew
  • Transformer-based architecture with enhancements including normalization techniques, GeLU activation functions, rotary embeddings, and separate embedding and output weights
  • Training details and hyperparameters provided using the NeMo framework
  • Models offered as starting points for fine-tuning on Hebrew-specific tasks such as instruction, Q&A, and sentiment analysis
  • Dedication to promoting research in Hebrew NLP and fostering innovation within natural language processing
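
As a rough illustration of the GeLU activation named in the key points, the sketch below computes the standard exact form, x * Phi(x), alongside the widely used tanh approximation. Which variant DictaLM uses is not stated here, so both are shown; the function names are hypothetical and chosen only for this example.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GeLU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Common tanh-based approximation of GeLU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    # Compare the two variants on a few sample inputs.
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(value, gelu_exact(value), gelu_tanh(value))
```

Unlike ReLU, GeLU is smooth around zero and lets small negative inputs pass through slightly attenuated rather than clipping them to zero.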

Authors: Shaltiel Shmidman, Avi Shmidman, Amir David Nissan Cohen, Moshe Koppel

License: CC BY 4.0

Abstract: We present DictaLM, a large-scale language model tailored for Modern Hebrew. Boasting 7B parameters, this model is predominantly trained on Hebrew-centric data. As a commitment to promoting research and development in the Hebrew language, we release both the foundation model and the instruct-tuned model under a Creative Commons license. Concurrently, we introduce DictaLM-Rab, another foundation model geared towards Rabbinic/Historical Hebrew. These foundation models serve as ideal starting points for fine-tuning various Hebrew-specific tasks, such as instruction, Q&A, sentiment analysis, and more. This release represents a preliminary step, offering an initial Hebrew LLM model for the Hebrew NLP community to experiment with.

Submitted to arXiv on 25 Sep. 2023

Results of the summarizing process for the arXiv paper: 2309.14568v1

The paper presents DictaLM, a large generative language model designed specifically for Modern Hebrew. The model has 7 billion parameters and is trained predominantly on Hebrew-centric data. To support research and development in the Hebrew language, the authors release both the foundation model and the instruct-tuned model under a Creative Commons license. They also introduce DictaLM-Rab, a second foundation model tailored to Rabbinic/Historical Hebrew.

The architecture of DictaLM is based on the transformer, with several enhancements aimed at improving training stability and overall performance: normalization techniques, GeLU activation functions, rotary embeddings (intended to allow longer sequence lengths without compromising performance), and separate embedding and output weights. The authors provide training details and hyperparameters for the NeMo framework, which is optimized for training compute-heavy machine learning models.

The authors offer these models as starting points for fine-tuning on Hebrew-specific tasks such as instruction, Q&A, and sentiment analysis. They describe the release as a preliminary step: an initial Hebrew LLM for the Hebrew NLP community to experiment with.
Created on 27 Feb. 2024
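
The rotary embeddings mentioned in the summary can be illustrated with a minimal sketch. It uses the common formulation in which consecutive pairs of vector components are rotated by an angle proportional to the token position; the base of 10000 and the interleaved pairing are assumptions taken from that common formulation, not details confirmed for DictaLM, and the function names are hypothetical.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate each consecutive pair of components by a position-dependent angle."""
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector dimension must be even")
    rotated = []
    for i in range(0, dim, 2):
        # Lower-index pairs rotate faster; higher-index pairs rotate slower.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        rotated.extend([x * cos_t - y * sin_t, x * sin_t + y * cos_t])
    return rotated


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    query = [1.0, 2.0, 3.0, 4.0]
    key = [0.5, -1.0, 2.0, 0.25]
    # Both pairs of positions are 2 apart, so the two scores should agree
    # up to floating-point error.
    print(dot(rotary_embed(query, 3), rotary_embed(key, 1)))
    print(dot(rotary_embed(query, 5), rotary_embed(key, 3)))
```

Because each pair is only rotated, vector norms are unchanged, and the attention score between a query and a key depends on their relative offset rather than on their absolute positions.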
