For AI, depending on your purpose, you will need a model

A model is a compressed data file containing statistical “knowledge.” It is not a program with fixed rules (if A happens, do B), but rather the result of massive training.

The model is shown billions of examples (texts, images, or music).

The Model: It is a complex mathematical structure (neural network). When you give it a prompt, the model doesn’t “search” a database, but rather calculates which pixels or words it should continue based on what it has learned.

It is independent of language. You don’t need Java, C, or Python.

Standard Protocol: You communicate with the AI ​​using JSON over HTTP

You don’t need a dataset.

You’re not going to “train” the AI ​​(companies with supercomputers have already done that); you’re going to use models that already know how to do their job.

The Model (Engine): It comes pre-built. It already knows how to generate text or images.

The Dataset (Food): The model already “consumed” it during its creation. You don’t have to give it millions of data points; the model file is already several GB because it has already “learned” all of that.
You only need the “Engine” (the model file) and to put it on your server.

Inference: This is the act of using that model to generate something new. This is what you would do on your server.

An AI model would be like a supermodel

  • The Llama 3 template for shaping text
  • The Stable Diffusion template for shaping images
  • The MusicGen template for shaping sounds

There are thousands of “Checkpoints” (variations of the model) created by the community for specific styles

analogy:

A truck engine (video model) is not the same as a motorcycle engine (light text model). The model is the “file” with the mathematical weights that knows how to process the information. Without an engine, the car won’t start.

For the engine to “know” how to run, it had to consume millions of images and texts over months. Without that initial “fuel” (dataset), the engine is just an empty metal block.

Prompts aren’t fuel, they’re the driver’s instructions.

The prompt tells the motor: “Accelerate,” “Turn left,” or “Brake.” It’s the direction you give to the car’s power. If you give it the wrong prompt, the car will end up on the shoulder.

An engine alone isn’t very useful. You need the chassis, the wheels, the steering wheel, and the body. AI is the combination of all of these working in harmony.

Everything is done in the terminal, console, or tty; you will need two different consoles.


Routes are the different paths your AI can take depending on what you want to generate.

  • Text Route: A fast and direct highway (Llama 3)
  • Image Route: A scenic and detailed road (Flux.1)
  • Video/Sound Route: Complex routes that require more processing power and time

For that “engine” (model) to perform well, you need a good battery and oil, which in computing is the VRAM of your GPU.

If you try to run a Formula 1 engine (Flux.1) on a Fiat 600 (a server without a dedicated graphics card), the engine will either burn out or simply won’t start.


execute to install ollama

brew install ollama

Once installed, you need to start the service that manages the models:

ollama serve

Leave that terminal open or minimized


Open a new terminal and run:

ollama run llama3
  • What will happen? Ollama will detect that you don’t have the model, download it (it’s about 4.7 GB), and then a cursor will appear saying >>>.
  • Now you can talk to it! Type anything in Spanish, and it will respond using the power of your processor (CPU) and RAM.

As you can see, she already knows Spanish and English because she has already been trained in those languages

During its creation (the training), the model “read” almost the entire public internet: Wikipedia, books, forums, code repositories (like the Java and C ones you use) and news.

  • The model processed billions of sentences in Spanish and just as many in Mandarin.
  • It learned that Spanish words have a statistical relationship with each other. It knows that after “Hola, ¿cómo…” the next word is most likely “estás?” and not a Chinese word.