Docs: Add extended quickstart and installation guides (release + source) #2388
janniklinde
left a comment
Thank you @yiseungmi87 for the good first PR; it seems quite clear and understandable so far.
I did not manage to set up SystemDS on Ubuntu by following only your guide (which should be the goal of the install guide), so please look into that. You can use a clean Docker image to follow your guide and identify possible points of failure. Similarly, please check that no such weak points exist for the other operating systems (if you have Windows, maybe try the setup as a new user). Also, I realized that cloning the SystemDS source code via GitHub Desktop on Windows might get stuck in the cloning process, so we should provide a solution for that (e.g., use the `git` CLI for cloning rather than the app). I have not tested the install on Windows / macOS yet but will do so once my current comments are resolved.
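The suggestion to clone with the `git` CLI instead of GitHub Desktop could look roughly like this. It is a minimal sketch, assuming the upstream repository is `https://github.com/apache/systemds.git`; the helper name `clone_systemds`, the default target directory, and the shallow `--depth 1` clone are illustrative choices rather than anything the guide prescribes.

```shell
# Assumption: upstream repository URL; point this at your fork if needed.
REPO_URL="https://github.com/apache/systemds.git"

# Hypothetical helper: shallow-clone SystemDS into the given directory
# (defaults to ./systemds). --depth 1 fetches only the latest commit,
# which keeps the download small.
clone_systemds() {
  target="${1:-systemds}"
  git clone --depth 1 "$REPO_URL" "$target"
}

# Usage (requires network access):
#   clone_systemds "$HOME/src/systemds"
```

Drop `--depth 1` if the full history is needed, for example to check out older release tags.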
Download the official release archive from the Apache SystemDS website:

https://apache.org/dyn/closer.lua/systemds/
Maybe rather point to https://systemds.apache.org/download
### 3.1 Extract the Release

```bash
cd /path/to/install
tar -xvf systemds-<VERSION>.tar.gz
cd systemds-<VERSION>
```

### 3.2 Add SystemDS to PATH

```bash
export SYSTEMDS_ROOT=$(pwd)
export PATH="$SYSTEMDS_ROOT/bin:$PATH"
```
I tried to follow the guide on Ubuntu 22.04 (I set up a fresh Docker image with Java, tar, and wget installed). After downloading and extracting the release, I got stuck with the error shown in the log below.
Docker image I tested on:
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install -y \
        openjdk-17-jdk \
        ca-certificates \
        wget \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /opt
RUN wget https://dlcdn.apache.org/systemds/3.3.0/systemds-3.3.0-bin.tgz && \
    tar -xzf systemds-3.3.0-bin.tgz && \
    rm systemds-3.3.0-bin.tgz
CMD ["bash"]
root@9385e1a25ddd:/opt# ls
systemds-3.3.0-bin
root@9385e1a25ddd:/opt# cd systemds-3.3.0-bin
root@9385e1a25ddd:/opt/systemds-3.3.0-bin# java -version
openjdk version "17.0.17" 2025-10-21
OpenJDK Runtime Environment (build 17.0.17+10-Ubuntu-122.04)
OpenJDK 64-Bit Server VM (build 17.0.17+10-Ubuntu-122.04, mixed mode, sharing)
root@9385e1a25ddd:/opt/systemds-3.3.0-bin# export SYSTEMDS_ROOT=$(pwd)
root@9385e1a25ddd:/opt/systemds-3.3.0-bin# export PATH="$SYSTEMDS_ROOT/bin:$PATH"
root@9385e1a25ddd:/opt/systemds-3.3.0-bin# systemds -help
Help requested. Will exit after extended usage message!
Usage: /opt/systemds-3.3.0-bin/bin/systemds [-r] [SystemDS.jar] [-f] <dml-filename> [arguments] [-help]
SystemDS.jar : Specify a custom SystemDS.jar file (this will be prepended
to the classpath
or fed to spark-submit
-r : Spawn a debug server for remote debugging (standalone and
spark driver only atm). Default port is 8787 - change within
this script if necessary. See SystemDS documentation on how
to attach a remote debugger.
-f : Optional prefix to the dml-filename for consistency with
previous behavior dml-filename : The script file to run.
This is mandatory unless running as a federated worker
(see below).
arguments : The arguments specified after the DML script are passed to
SystemDS. Specify parameters that need to go to
java/spark-submit by editing this run script.
-help : Print this usage message and SystemDS parameter info
Worker Usage: /opt/systemds-3.3.0-bin/bin/systemds [-r] WORKER [SystemDS.jar] <portnumber> [arguments] [-help]
port : The port to open for the federated worker.
Federated Monitoring Usage: /opt/systemds-3.3.0-bin/bin/systemds [-r] FEDMONITORING [SystemDS.jar] <portnumber> [arguments] [-help]
port : The port to open for the federated monitoring tool.
Set custom launch configuration by setting/editing SYSTEMDS_STANDALONE_OPTS
and/or SYSTEMDS_DISTRIBUTED_OPTS.
Set the environment variable SYSDS_DISTRIBUTED=1 to run spark-submit instead of
local java Set SYSDS_QUIET=1 to omit extra information printed by this run
script.
----------------------------------------------------------------------
Further help on SystemDS arguments:
Error: Unable to access jarfile org.apache.sysds.api.DMLScript
root@9385e1a25ddd:/opt/systemds-3.3.0-bin# cd ..
root@9385e1a25ddd:/opt# echo 'print("Hello World!")' > hello.dml
root@9385e1a25ddd:/opt# systemds -f hello.dml
###############################################################################
# SYSTEMDS_ROOT= /opt/systemds-3.3.0-bin
# SYSTEMDS_JAR_FILE=
# SYSDS_EXEC_MODE= singlenode
# CONFIG_FILE= -config /opt/systemds-3.3.0-bin/conf/SystemDS-config.xml
# LOG4JPROP= -Dlog4j.configuration=file:/opt/systemds-3.3.0-bin/conf/log4j.properties
# HADOOP_HOME= /opt/systemds-3.3.0-bin/lib/hadoop
#
# Running script hello.dml locally with opts:
# Executing command: java -Xmx4g -Xms4g -Xmn400m -Dlog4j.configuration=file:/opt/systemds-3.3.0-bin/conf/log4j.properties -jar -f hello.dml -exec singlenode -config /opt/systemds-3.3.0-bin/conf/SystemDS-config.xml
###############################################################################
Error: Invalid or corrupt jarfile hello.dml
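In the log above, `SYSTEMDS_JAR_FILE=` is empty and the generated `java` command has no jar path after `-jar`, which suggests the launcher did not locate a SystemDS jar. A quick check along these lines can confirm that before running a script. This is only a sketch: the helper name `find_systemds_jar`, the search depth, and the `systemds*.jar` name pattern are assumptions, not the launcher's actual lookup logic.

```shell
# Hypothetical helper: print the first jar whose name starts with
# "systemds" (case-insensitive) under a root directory, or return
# non-zero if none is found. Pattern and depth are assumptions.
find_systemds_jar() {
  root="${1:-$SYSTEMDS_ROOT}"
  jar=$(find "$root" -maxdepth 2 -type f -iname 'systemds*.jar' 2>/dev/null | sort | head -n 1)
  [ -n "$jar" ] && printf '%s\n' "$jar"
}

# Usage:
#   find_systemds_jar "$SYSTEMDS_ROOT" || echo "no SystemDS jar under $SYSTEMDS_ROOT"
```

If the check finds nothing, the problem is likely in the extracted archive layout or the launcher's search path rather than in the DML script.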
It can be beneficial to enter these into your `~/.profile` or `~/.bashrc` on Linux
(but remember to change `$(pwd)` to the full folder path),
or into your environment variables on Windows, to enable reuse between terminals and restarts.

```bash
# $(pwd) is expanded now, so the full path is written to ~/.bashrc
echo "export SYSTEMDS_ROOT=\"$(pwd)\"" >> ~/.bashrc
# single quotes keep $SYSTEMDS_ROOT and $PATH unexpanded until the shell starts
echo 'export PATH="$SYSTEMDS_ROOT/bin:$PATH"' >> ~/.bashrc
```
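Running the `echo` commands above more than once appends duplicate lines to `~/.bashrc`. A small guard avoids that. This is a minimal sketch; `append_once` is a hypothetical helper, not part of SystemDS.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present (-x: whole line, -F: fixed string).
append_once() {
  line="$1"
  file="$2"
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage, assuming the current directory is the extracted release:
#   append_once "export SYSTEMDS_ROOT=\"$(pwd)\"" "$HOME/.bashrc"
#   append_once 'export PATH="$SYSTEMDS_ROOT/bin:$PATH"' "$HOME/.bashrc"
```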
I would mention that in release_install as well. Otherwise, people who only follow the quickstart might get confused after restarting the terminal. Also, for prerequisites that are already mentioned in the install guides, reference them rather than repeating the same thing.
It would be nice to mention that you can also add the bin folder to PATH. Then you can directly access your last local build through the CLI.
This PR introduces improved documentation for new users of SystemDS:
Added
- quickstart_extended.md - Overview page linking installation and execution docs
- release_install.md - Clean, updated installation guide for release users
- source_install.md - Updated guide for building SystemDS from source
- run_extended.md - Comprehensive execution guide (local, Spark, federated)
- run.md - Slightly modified
Scope
Purpose
These changes provide clearer onboarding for new SystemDS users and consolidate documentation into a consistent structure.
Let me know if adjustments are desired before merging.