@@ -4,7 +4,7 @@
There's a large variety of https://huggingface.co/models[models] available from https://huggingface.co[HuggingFace], and https://huggingface.co/instructlab[InstructLab] is an open-source collection of LLMs with tools that allow users to both use, and improve, LLMs based on Granite models.

-There are also model container images available on https://catalog.redhat.com/search?gs&q=granite%208b[Red Hat Ecosystem Catalog] (the link is just for the Granite 8b family).
+There are also OCI model images available on https://catalog.redhat.com/search?gs&q=granite%208b[Red Hat Ecosystem Catalog] (the link is just for the Granite 8b family).

A https://developers.redhat.com/articles/2024/08/01/open-source-ai-coding-assistance-granite-models[Red Hat blog] by Cedric Clyburn shows how you can use Ollama and InstructLab to run LLMs locally in a lot more detail, so I'll keep it short and with a focus on Conda here.
@@ -122,9 +122,13 @@ The format of the model is HuggingFace _safetensors_, which requires the https:/
From here on, there are two options: either install vLLM manually, or use `llama.cpp` to convert the model to GGUF.

+Personally, I prefer the second option as it very often also results in a smaller model, and does not require too much manual hacking about. You can even have a separate Conda environment just for `llama.cpp`.
+
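+A rough sketch of that route, assuming a dedicated `llama-cpp` Conda environment and placeholder paths (the conversion script name and flags are those of recent `llama.cpp` checkouts, so check them against the repo you clone):
+
+[subs="+quotes"]
+----
+(llama-cpp) $ *git clone https://github.com/ggml-org/llama.cpp.git*
+Cloning into 'llama.cpp'...
+...
+(llama-cpp) $ *cd llama.cpp*
+(llama-cpp) $ *pip install -r requirements.txt*
+...
+(llama-cpp) $ *python convert_hf_to_gguf.py /foo/bar/baz/model --outtype q8_0 --outfile /foo/bar/baz/model-q8_0.gguf*
+...
+----
+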
=== Installing vLLM on macOS ===

-If you used the InstructLab env file provided in this repo, you should already have `torch` and `torchvision` modules in the environment. If not, ensure they are available.
+If you used the InstructLab env file provided in this repo, you should already have `cmake`, `torch`, and `torchvision` modules in the environment. If not, ensure they are available.
+
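+If you are not sure, it is safe to simply run the install again, since `pip` skips anything that is already satisfied (a minimal sketch; the versions pulled in depend on your environment):
+
+[subs="+quotes"]
+----
+(ilab-25) $ *pip install cmake torch torchvision*
+...
+----
+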
+During the compilation steps below, `pip` in particular may complain about some dependency incompatibilities. These complaints can be ignored.

First, clone Triton and install it.
@@ -136,11 +140,6 @@ Cloning into 'triton'...
(ilab-25) $ *cd triton/python*

-(ilab-25) $ *pip install cmake*
-Collecting cmake
-...
-Successfully installed cmake-4.0.0
-
(ilab-25) $ *pip install -e .*
Obtaining file:///foo/bar/baz/triton/python
...
@@ -152,6 +151,10 @@ Successfully installed triton-3.3.0+git32b42821
(ilab-25) $ *rm -rf ./triton/*
----

+[NOTE]
+====
+Triton compilation takes quite a long time and it may appear to be doing nothing. Don't worry.
+====
+
Clone vLLM and build it.

[subs="+quotes"]
@@ -162,7 +165,7 @@ Cloning into 'vllm'...
(ilab-25) $ *cd vllm*

-(ilab-25) $ *sed -i 's/^triton==3.2/triton==3.3/' requirements/requirements-cpu.txt
+(ilab-25) $ *sed -i 's/^triton==3.2/triton==3.3/' requirements/requirements-cpu.txt*
(ilab-25) $ *pip install -e .*
Obtaining file:///foo/bar/baz/vllm
...
@@ -174,6 +177,10 @@ Successfully installed vllm-0.8.5.dev3+g7cbfc1094.d20250414
(ilab-25) $ *rm -rf ./vllm/*
----

+[NOTE]
+====
+vLLM 0.8.5 restricts Triton to version 3.2 in its CPU requirements, which is not necessary; the `sed` edit above raises the pin to 3.3 so the Triton build from the earlier step is accepted.
+====
+
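+Once both builds are done, a quick import check confirms that the editable installs are visible in the environment (a minimal sketch; the printed versions will be whatever you just built):
+
+[subs="+quotes"]
+----
+(ilab-25) $ *python -c 'import triton, vllm; print(triton.__version__, vllm.__version__)'*
+...
+----
+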
References:

* https://github.com/triton-lang/triton[Triton Development Repository]