Running Tasks
LlamaFactory Adapter Plugin
Configuration flow
Before running a training task, configure these environment variables in bash:
| Environment variable | Description |
|---|---|
| ECO_GRPC_ADDR | Server address |
| ECO_CLIENT_ID | Unique user identifier [ID] |
| ECO_API_KEY | API authentication value [PASSWORD] |
| ECO_TLS_ROOT_CA | TLS root certificate path |
Example:
export ECO_GRPC_ADDR="121.41.XXX.XX:80"
export ECO_CLIENT_ID="User_XX"
export ECO_API_KEY="XXXX"
export ECO_TLS_ROOT_CA="xx/rootCA.pem"
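A misconfigured variable is easier to catch before the training job starts. The following is a minimal pre-flight sketch, not part of the plugin: the helper name `check_eco_env` is hypothetical, and it only assumes the four variable names listed above. It verifies that each variable is set, has no leading or trailing whitespace, and that the root certificate path points to an existing file. It does not validate the certificate contents or contact the server.

```python
import os

# Variable names taken from the table above.
REQUIRED_VARS = ("ECO_GRPC_ADDR", "ECO_CLIENT_ID", "ECO_API_KEY", "ECO_TLS_ROOT_CA")


def check_eco_env(env=None):
    """Return a list of problems found; an empty list means the checks passed.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace")

    # Only check the file when a path was actually provided.
    ca_path = env.get("ECO_TLS_ROOT_CA", "").strip()
    if ca_path and not os.path.isfile(ca_path):
        problems.append(f"ECO_TLS_ROOT_CA points to a missing file: {ca_path}")
    return problems
```

Calling `check_eco_env()` in the same shell session where the variables were exported returns an empty list when these basic checks pass.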
Usage example
Replace these values based on your environment:
- Server address
- Certificate path
- Data path
- Model path
- Configuration file path
Example command:
export ECO_GRPC_ADDR="121.41.XXX.XX:80" && \
export ECO_CLIENT_ID="User_XX" && \
export ECO_API_KEY="XXXX" && \
export ECO_TLS_ROOT_CA="/root/rootCA.pem" && \
CUDA_VISIBLE_DEVICES=0,1 accelerate launch --config_file fsdp_config.yaml \
--main_process_port 29501 src/train.py emotion_rec_sft_full_eco.yaml
Plugin log reference
(1) Plugin imported and initialized successfully
The log includes:
[EcoPhase] EcoMonitor initialized.
(2) Plugin enabled
The log includes:
[EcoPhase] API is enabled.
(3) Plugin inactive
The log includes:
[EcoPhase] API is disabled.
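To check a saved training log for these markers without reading it by hand, a small scan like the one below may help. It is only a sketch: the function name `plugin_states` is hypothetical, and it assumes the marker strings appear in the log exactly as shown above, possibly preceded by a logger prefix.

```python
# Marker strings copied from the log reference above.
MARKERS = {
    "[EcoPhase] EcoMonitor initialized.": "initialized",
    "[EcoPhase] API is enabled.": "enabled",
    "[EcoPhase] API is disabled.": "disabled",
}


def plugin_states(log_lines):
    """Return the plugin states found in the log, in order of first appearance."""
    seen = []
    for line in log_lines:
        for marker, state in MARKERS.items():
            if marker in line and state not in seen:
                seen.append(state)
    return seen
```

For example, passing the lines of a log file (`plugin_states(open(path))`, where `path` is your own log location) returns a list such as `["initialized", "enabled"]` when the plugin was both loaded and enabled.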
(4) Early stop triggered
The system automatically saves the model and prints a training summary, for example:
Task early stopped at step 200/2000. Reduction: 90.0%. Saved GPU-hours: 1.03.
This means:
| Field | Meaning |
|---|---|
| 200/2000 | The task stopped early at step 200 out of the planned 2000 steps |
| Reduction: 90.0% | Training steps were reduced by about 90.0% |
| Saved GPU-hours: 1.03 | Estimated savings of 1.03 GPU-hours |
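The reduction figure follows from the step counts: (2000 - 200) / 2000 = 90.0%. The sketch below parses a summary line and recomputes that percentage as a sanity check. It assumes the summary always follows the format of the single example above, and the name `parse_early_stop` is hypothetical. How the plugin estimates saved GPU-hours is not documented here, so that value is only extracted, not recomputed.

```python
import re

# Pattern inferred from the one example line above; the real format may vary.
SUMMARY_RE = re.compile(
    r"Task early stopped at step (?P<step>\d+)/(?P<total>\d+)\. "
    r"Reduction: (?P<reduction>\d+(?:\.\d+)?)%\. "
    r"Saved GPU-hours: (?P<gpu_hours>\d+(?:\.\d+)?)"
)


def parse_early_stop(line):
    """Parse an early-stop summary line; return None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    step, total = int(m["step"]), int(m["total"])
    return {
        "step": step,
        "total": total,
        "reported_reduction": float(m["reduction"]),
        # Share of planned steps that were skipped, in percent.
        "computed_reduction": round(100.0 * (total - step) / total, 1),
        "saved_gpu_hours": float(m["gpu_hours"]),
    }
```

Comparing `reported_reduction` with `computed_reduction` gives a quick consistency check on the logged summary.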
Notes
- Environment variables must be configured correctly and must not contain extra spaces.
- The root certificate must exist and be valid.
- The plugin initialization code must be inserted at the correct position in trainer.py.
- Confirm this before running:
enabled=True
- In the logs, check for these states first:
API is enabled
API is disabled
- The plugin currently supports data parallelism only. Other parallel modes are not supported yet.