dWGS Upload Workshop¶
This workshop is for users who want to learn how to use the CGPClient library to:
- Upload DRAGEN runs
- List available files
- Download datasets
Orientation¶
Environment Setup¶
If you have not yet done so, please complete the Environment Setup before proceeding.
About This Workshop¶
This workshop provides a hands-on example of uploading files generated by DRAGEN software (version ≥4.0.0) after demultiplexing a complete sequencing run.
Generate Mock Data¶
We'll start by creating realistic mock data that mirrors actual DRAGEN output:
bash docs/training/generate_mock_run.sh
What This Creates
The script generates a complete run folder structure including:
- Run folder: Named with a unique run ID (e.g., 250702_A00123_0001_CE956995D6)
- fastq_list.csv: Metadata file listing all FASTQ files
- RunInfo.xml: Basic run setup metadata
- FASTQ files: Sample sequencing data files
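To make the role of fastq_list.csv more concrete, here is a minimal sketch that groups FASTQ paths by sample. It assumes the DRAGEN-style column names RGSM, Read1File and Read2File; the columns written by the mock script are not shown in this workshop, so the header names, the example rows and the helper `files_by_sample` are all illustrative assumptions.

```python
import csv
import io
from collections import defaultdict


def files_by_sample(csv_text):
    """Map each sample ID (RGSM column) to its list of FASTQ paths."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed DRAGEN-style columns: RGSM, Read1File, Read2File.
        for column in ("Read1File", "Read2File"):
            path = row.get(column)
            if path:
                grouped[row["RGSM"]].append(path)
    return dict(grouped)


# Illustrative content only; a real fastq_list.csv may differ.
example = """RGID,RGSM,RGLB,Lane,Read1File,Read2File
1,2510585,UnknownLibrary,1,Data/Fastq/2510585/2510585_S1_L001_R1_001.fastq.ora,Data/Fastq/2510585/2510585_S1_L001_R2_001.fastq.ora
2,2514204,UnknownLibrary,1,Data/Fastq/2514204/2514204_S2_L001_R1_001.fastq.ora,Data/Fastq/2514204/2514204_S2_L001_R2_001.fastq.ora
"""

for sample, paths in files_by_sample(example).items():
    print(sample, len(paths))
```

To try it on your own run, read the real file with `Path(...).read_text()` and pass the text to the helper; if the column names differ, adjust the tuple in the loop.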
Expected Directory Structure¶
After running the script, you should see a folder structure like this:
250702_A00123_0001_CE956995D6/
├── Data/
│ └── Fastq/
│ ├── 2510585/
│ │ ├── 2510585_S1_L001_R1_001.fastq.ora
│ │ └── 2510585_S1_L001_R2_001.fastq.ora
│ └── 2514204/
│ ├── 2514204_S2_L001_R1_001.fastq.ora
│ └── 2514204_S2_L001_R2_001.fastq.ora
├── fastq_list.csv
└── RunInfo.xml
Checkpoint
Before continuing: Verify that your mock data has been generated successfully by running:
ls -la 250702_A00123_0001_CE956995D6/
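If you prefer a scripted check over reading the `ls` output, the sketch below tests for the pieces shown in the expected directory structure. The helper name `check_run_folder` is hypothetical and not part of CGPClient; it only assumes the layout from the tree above.

```python
from pathlib import Path


def check_run_folder(run_dir):
    """Return a list of problems found in a run folder (empty if none)."""
    run_dir = Path(run_dir)
    problems = []
    for name in ("fastq_list.csv", "RunInfo.xml"):
        if not (run_dir / name).is_file():
            problems.append(f"missing {name}")
    # FASTQ files are expected below Data/Fastq/<sample>/ as in the tree above.
    fastq_root = run_dir / "Data" / "Fastq"
    fastqs = sorted(fastq_root.glob("*/*.fastq*")) if fastq_root.is_dir() else []
    if not fastqs:
        problems.append("no FASTQ files under Data/Fastq")
    return problems


issues = check_run_folder("250702_A00123_0001_CE956995D6")
print("OK" if not issues else "\n".join(issues))
```

Replace the folder name with the run ID that the script generated for you.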
Part 1: Upload DRAGEN Run¶
Configure the Client¶
Create your configuration file:
nano config.yaml
Add the following configuration:
verbose: true
output_dir: /tmp/output
api_host: <TBC>
override_api_base_url: true
api_key: <TBC>
ods_code: <add_your_ods_code>
Action Required
Replace <add_your_ods_code> with your actual ODS code (e.g., 8J834) before proceeding.
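A quick way to catch a forgotten placeholder is to scan the configuration for values that still look like `<...>`. The sketch below is a rough check on flat `key: value` lines, not a YAML parser, and `unresolved_settings` is a hypothetical helper rather than anything CGPClient provides.

```python
def unresolved_settings(config_text):
    """Return the keys whose values still look like <placeholder> markers."""
    unresolved = []
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        value = value.strip()
        # Only flat "key: value" lines are handled; this is not a YAML parser.
        if sep and value.startswith("<") and value.endswith(">"):
            unresolved.append(key.strip())
    return unresolved


template = """verbose: true
output_dir: /tmp/output
api_host: <TBC>
override_api_base_url: true
api_key: <TBC>
ods_code: <add_your_ods_code>
"""

print(unresolved_settings(template))  # ['api_host', 'api_key', 'ods_code']
```

An empty list means every placeholder in the file has been filled in.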
Explore the Upload Script¶
First, understand what options are available:
cgpclient/scripts/upload_dragen_run -h
Most arguments from the help output correspond to settings in your config.yaml file.
Prepare Your Upload¶
Before running the upload, you'll need unique IDs for this exercise:
- Run ID: Use the generated folder name (e.g., 250702_A00123_0001_CE956995D6)
- Patient ID: Create one starting with 'p' (e.g., p5678)
- Referral ID: Create one starting with 'r' (e.g., r9012)
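If you would rather not invent the IDs by hand, the following sketch generates them. It assumes only what this workshop states, namely a 'p' or 'r' prefix; the four-digit length copies the examples and is not a documented requirement, and `make_id` is a hypothetical helper.

```python
import secrets
from pathlib import Path


def make_id(prefix, digits=4):
    """Return an ID such as 'p5678': a prefix followed by random digits."""
    number = secrets.randbelow(10 ** digits)
    return f"{prefix}{number:0{digits}d}"


# The run ID is simply the name of the generated run folder.
run_id = Path("250702_A00123_0001_CE956995D6/").name
patient_id = make_id("p")
referral_id = make_id("r")
print(run_id, patient_id, referral_id)
```

Note the printed values down, since the same three IDs are needed again when listing files.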
Execute the Upload¶
Run the upload command with your generated data:
cgpclient/scripts/upload_dragen_run \
-f 250702_A00123_0001_CE956995D6/fastq_list.csv \
-rif 250702_A00123_0001_CE956995D6/RunInfo.xml \
-i 250702_A00123_0001_CE956995D6 \
-p p5678 \
-r r9012 \
-cfg config.yaml
Expected Success Message
If successful, you should see: "Successfully posted FHIR resource"
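Because the run folder name appears three times in the upload command, it is easy to mistype one of them. The sketch below derives all three values from a single path and prints the resulting command. It uses only the flags shown in the command above; `build_upload_command` is a hypothetical wrapper, not part of CGPClient.

```python
import shlex
from pathlib import Path


def build_upload_command(run_dir, patient_id, referral_id, config="config.yaml"):
    """Assemble the upload_dragen_run argument list from a run folder path."""
    run_dir = Path(run_dir)
    return [
        "cgpclient/scripts/upload_dragen_run",
        "-f", str(run_dir / "fastq_list.csv"),
        "-rif", str(run_dir / "RunInfo.xml"),
        "-i", run_dir.name,
        "-p", patient_id,
        "-r", referral_id,
        "-cfg", config,
    ]


command = build_upload_command("250702_A00123_0001_CE956995D6", "p5678", "r9012")
print(shlex.join(command))
# To execute it, pass the list to subprocess.run(command, check=True).
```

Printing the command first lets you compare it with the one above before anything is sent to the platform.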
Part 2: List Available Files¶
Now let's check which files were uploaded.
Explore the List Command¶
cgpclient/scripts/list_files -h
List Your Uploaded Files¶
cgpclient/scripts/list_files \
-i 250702_A00123_0001_CE956995D6 \
-p p5678 \
-r r9012 \
-cfg config.yaml
Expected Output¶
You should see output similar to this:
name                                 size    content_type     last_updated         author_ods_code    referral_id    participant_id    sample_id    run_id
-----------------------------------  ------  ---------------  -------------------  -----------------  -------------  ----------------  -----------  -----------------------------
2514204_S2_L001_R1_001.fastq.ora     355786  text/fastq       2025-07-02T14:58:30  8J834              r9012          p5678             2514204      250702_A00123_0001_CE956995D6
2514204_S2_L001_R2_001.fastq.ora     355786  text/fastq       2025-07-02T14:58:30  8J834              r9012          p5678             2514204      250702_A00123_0001_CE956995D6
RunInfo.xml                          547     application/xml  2025-07-02T14:58:30  8J834              r9012          p5678             2514204      250702_A00123_0001_CE956995D6
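To work with the listing in a script, for example to pick a file name for the download step, the table can be split into records. This is a sketch that assumes the output format shown above, with whitespace-separated columns and no spaces inside any field; `parse_listing` is a hypothetical helper and the sample text is abbreviated.

```python
def parse_listing(text):
    """Turn the whitespace-separated listing into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        if set(line.strip()) <= {"-", " "}:
            continue  # skip the dashed separator row
        # Assumes no field contains spaces, as in the sample output.
        rows.append(dict(zip(header, line.split())))
    return rows


sample_output = """\
name size content_type last_updated author_ods_code referral_id participant_id sample_id run_id
---- ---- ------------ ------------ --------------- ----------- -------------- --------- ------
2514204_S2_L001_R1_001.fastq.ora 355786 text/fastq 2025-07-02T14:58:30 8J834 r9012 p5678 2514204 250702_A00123_0001_CE956995D6
RunInfo.xml 547 application/xml 2025-07-02T14:58:30 8J834 r9012 p5678 2514204 250702_A00123_0001_CE956995D6
"""

for row in parse_listing(sample_output):
    print(row["name"], row["size"])
```

All values are kept as strings; convert `size` with `int()` if you need to compare it numerically.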
Part 3: Download Files¶
Finally, let's download files from the platform.
Explore Download Options¶
cgpclient/scripts/download_file -h
Download a Specific File¶
Choose a file from your list output and download it:
cgpclient/scripts/download_file \
-f 2514204_S2_L001_R1_001.fastq.ora \
-cfg config.yaml
File Location
Downloaded files will be saved to the output_dir specified in your config.yaml (default: /tmp/output).
Verify the Download¶
Check that your file was downloaded successfully:
ls -la /tmp/output/
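The same check can be scripted. The sketch below assumes that the file is saved under its listed name inside output_dir and that the size column of the listing is in bytes; neither is confirmed by this workshop, so treat a mismatch as a prompt to look closer rather than as proof of a failed download. `check_download` is a hypothetical helper.

```python
from pathlib import Path


def check_download(name, expected_size=None, output_dir="/tmp/output"):
    """Return True if the file exists, is non-empty, and matches expected_size."""
    path = Path(output_dir) / name
    if not path.is_file():
        return False
    size = path.stat().st_size
    if expected_size is not None:
        return size == expected_size
    return size > 0


print(check_download("2514204_S2_L001_R1_001.fastq.ora", expected_size=355786))
```

Leave out `expected_size` to test only that the file exists and is not empty.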
Interactive Exercise
Challenge: Try downloading another file from your list. Can you download the RunInfo.xml file as well?
Workshop Summary¶
Congratulations! You have completed the dWGS upload workflow:
- Configured CGPClient with your credentials and settings
- Uploaded a complete DRAGEN run with FASTQ files and metadata
- Listed available files to verify the upload
- Downloaded files from the platform