Ollama Basic Concepts | Beginner Tutorial
1. Model
In Ollama, models are the core components. They are pre-trained machine learning models that can perform various tasks such as text generation, text summarization, sentiment analysis, and dialogue generation.
Ollama supports multiple popular pre-trained models, including:
- deepseek-v3: A large language model provided by DeepSeek, specifically designed for text generation tasks.
- LLama2: A large language model provided by Meta, specifically designed for text generation tasks.
- GPT: The GPT series of models by OpenAI, suitable for wide-ranging dialogue generation and text inference tasks.
- BERT: A pre-trained model used for sentence understanding and question answering systems.
- Other Custom Models: Users can upload their own custom models and use Ollama for inference.
Main functions of models include:
- Inference: Generate output results based on user input.
- Fine-tuning: Users can train existing models using their own data to customize them for specific tasks or domains.
Models are typically composed of neural networks with numerous parameters, which learn language patterns through training on large text data and perform efficient inference.
The models supported by Ollama can be accessed at: https://ollama.com/library
Clicking on a model will display the download command:
| Model | Parameters | Size | Download Command |
|---|---|---|---|
| Llama 3.3 | 70B | 43GB | ollama run llama3.3 |
| Llama 3.2 | 3B | 2.0GB | ollama run llama3.2 |
| Llama 3.2 | 1B | 1.3GB | ollama run llama3.2:1b |
| Llama 3.2 Vision | 11B | 7.9GB | ollama run llama3.2-vision |
| Llama 3.2 Vision | 90B | 55GB | ollama run llama3.2-vision:90b |
| Llama 3.1 | 8B | 4.7GB | ollama run llama3.1 |
| Llama 3.1 | 405B | 231GB | ollama run llama3.1:405b |
| Phi 4 | 14B | 9.1GB | ollama run phi4 |
| Phi 3 Mini | 3.8B | 2.3GB | ollama run phi3 |
| Gemma 2 | 2B | 1.6GB | ollama run gemma2:2b |
| Gemma 2 | 9B | 5.5GB | ollama run gemma2 |
| Gemma 2 | 27B | 16GB | ollama run gemma2:27b |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Moondream 2 | 1.4B | 829MB | ollama run moondream |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Solar | 10.7B | 6.1GB | ollama run solar |
2. Task
Ollama supports multiple NLP tasks. Each task corresponds to different application scenarios of the model, mainly including but not limited to the following:
- Chat Generation: Generating natural conversation replies through interaction with users.
- Text Generation: Generating natural language text based on given prompts, such as writing articles, generating stories, etc.
- Sentiment Analysis: Analyzing the emotional tone of given text (positive, negative, neutral).
- Text Summarization: Compressing long texts into concise summaries.
- Translation: Translating text from one language to another.
Users can specify different tasks and load different models to complete specific tasks through command-line tools.
3. Inference
Inference refers to the process of processing inputs on a trained model to generate outputs.
Ollama provides easy-to-use command-line tools or APIs, allowing users to quickly provide inputs to the model and obtain results.
Inference is one of the main functions of Ollama and also the core way to interact with the model.
- Input: Users provide text input to the model, which can be a question, prompt, or dialogue content.
- Model Processing: The model generates appropriate outputs based on its built-in neural network according to the input.
- Output: The model returns the generated text content, which may be a reply, an article, translated text, etc.
Ollama interacts with local models via API or CLI, making it easy for users to achieve inference tasks.
4. Fine-tuning
Fine-tuning involves further training a pre-trained model on specific domain data to improve its performance in specific tasks or domains.
Ollama supports fine-tuning functionality. Users can fine-tune pre-trained models using their own datasets to customize the model's output.
- Data Preparation: Users prepare specific domain datasets, usually in text file or JSON format.
- Loading Pre-trained Model: Select a pre-trained model suitable for fine-tuning, such as the LLama2 or GPT model.
- Training: Train the model using the user's specific dataset to make it better adapted to the target task.
- Save and Deploy: After training, the fine-tuned model can be saved and deployed for future use.
Fine-tuning helps the model perform more accurately and efficiently when handling specific domain problems.
YouTip