dWGS Upload Workshop

This workshop is for users who want to learn how to use the CGPClient library to:

  • Upload DRAGEN runs
  • List available files
  • Download datasets

Orientation

Environment Setup

If you have not yet done so, please complete the Environment Setup before proceeding.

About This Workshop

This workshop provides a hands-on example of uploading files generated by DRAGEN software (version ≥4.0.0) after demultiplexing a complete sequencing run.

Generate Mock Data

We'll start by creating realistic mock data that mirrors actual DRAGEN output:

bash docs/training/generate_mock_run.sh

What This Creates

The script generates a complete run folder structure including:

  • Run folder: Named with a unique run ID (e.g., 250702_A00123_0001_CE956995D6)
  • fastq_list.csv: Metadata file listing all FASTQ files
  • RunInfo.xml: Basic run setup metadata
  • FASTQ files: Sample sequencing data files

Expected Directory Structure

After running the script, you should see a folder structure like this:

250702_A00123_0001_CE956995D6/
├── Data/
│   └── Fastq/
│       ├── 2510585/
│       │   ├── 2510585_S1_L001_R1_001.fastq.ora
│       │   └── 2510585_S1_L001_R2_001.fastq.ora
│       └── 2514204/
│           ├── 2514204_S2_L001_R1_001.fastq.ora
│           └── 2514204_S2_L001_R2_001.fastq.ora
├── fastq_list.csv
└── RunInfo.xml

Checkpoint

Before continuing: Verify that your mock data has been generated successfully by running:

ls -la 250702_A00123_0001_CE956995D6/
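
The checkpoint above can also be scripted. The following is a minimal sketch, assuming the run folder matches the expected directory structure shown earlier; `check_run_folder` is a hypothetical helper written for this workshop and is not part of CGPClient.

```shell
# Hypothetical helper (not part of CGPClient): check that a mock run folder
# contains the files listed in the expected directory structure.
check_run_folder() {
  local run_dir="$1"
  local status=0
  local f
  # Top-level metadata files expected directly inside the run folder.
  for f in fastq_list.csv RunInfo.xml; do
    if [ ! -f "$run_dir/$f" ]; then
      echo "missing: $run_dir/$f"
      status=1
    fi
  done
  # Count the FASTQ files under Data/Fastq (tr strips padding that some wc builds add).
  local n
  n=$(find "$run_dir/Data/Fastq" -type f -name '*.fastq.ora' 2>/dev/null | wc -l | tr -d ' ')
  echo "fastq files found: $n"
  if [ "$n" -eq 0 ]; then
    status=1
  fi
  return "$status"
}
```

Calling `check_run_folder 250702_A00123_0001_CE956995D6` returns a non-zero status if either metadata file is absent or no `.fastq.ora` files are found, so it can gate the next step in a script.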


Part 1: Upload DRAGEN Run

Configure the Client

Create your configuration file:

nano config.yaml

Add the following configuration:

verbose: true
output_dir: /tmp/output
api_host: <TBC>
override_api_base_url: true
api_key: <TBC>
ods_code: <add_your_ods_code>

Action Required

Replace <add_your_ods_code> with your actual ODS code (e.g., 8J834) before proceeding.
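
A quick way to catch a forgotten placeholder is to search the file for angle-bracket tokens. This is a sketch only: `find_placeholders` is a hypothetical helper, and it assumes that every unfilled value in config.yaml still looks like `<...>`, as in the template above. It will also flag the `<TBC>` entries until those are filled in.

```shell
# Hypothetical helper: report lines in a config file that still contain
# an angle-bracket placeholder such as <add_your_ods_code> or <TBC>.
find_placeholders() {
  local cfg="$1"
  # grep -n prints each matching line with its line number;
  # an exit status of 0 means at least one placeholder remains.
  if grep -n '<[^>]*>' "$cfg"; then
    echo "placeholders still present in $cfg"
    return 1
  fi
  echo "no placeholders found in $cfg"
}
```

Run it as `find_placeholders config.yaml` before moving on to the upload.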

Explore the Upload Script

First, understand what options are available:

cgpclient/scripts/upload_dragen_run -h

Most arguments from the help output correspond to settings in your config.yaml file.

Prepare Your Upload

Before running the upload, you'll need unique IDs for this exercise:

  • Run ID: Use the generated folder name (e.g., 250702_A00123_0001_CE956995D6)
  • Patient ID: Create one starting with 'p' (e.g., p5678)
  • Referral ID: Create one starting with 'r' (e.g., r9012)
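
If you prefer not to invent IDs by hand, the three values can be held in shell variables. The sketch below assumes the platform accepts any digits after the 'p' and 'r' prefixes, which this workshop does not state explicitly; the variable names are illustrative only.

```shell
# Illustrative variables; adjust RUN_DIR to the folder the mock script created.
RUN_DIR="250702_A00123_0001_CE956995D6"

# The run ID is the folder name itself.
RUN_ID=$(basename "$RUN_DIR")

# Use the current time as a suffix so repeated attempts get different IDs.
# Assumption: any digits after the prefix are acceptable to the platform.
SUFFIX=$(date +%H%M%S)
PATIENT_ID="p${SUFFIX}"
REFERRAL_ID="r${SUFFIX}"

echo "run=$RUN_ID patient=$PATIENT_ID referral=$REFERRAL_ID"
```

Whichever IDs you choose, use the same values in the upload and list commands that follow.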

Execute the Upload

Run the upload command with your generated data:

cgpclient/scripts/upload_dragen_run \
  -f 250702_A00123_0001_CE956995D6/fastq_list.csv \
  -rif 250702_A00123_0001_CE956995D6/RunInfo.xml \
  -i 250702_A00123_0001_CE956995D6 \
  -p p5678 \
  -r r9012 \
  -cfg config.yaml

Expected Success Message

If successful, you should see: "Successfully posted FHIR resource"
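
To check for the success message without reading the full output, you can save the upload output to a file and search it. This is a minimal sketch; `upload_succeeded` is a hypothetical helper, and it assumes the message text appears verbatim in the captured output (whether the script writes it to stdout or stderr is not stated here, so the example captures both).

```shell
# Hypothetical helper: return 0 if a saved log contains the success message.
upload_succeeded() {
  local log="$1"
  if grep -q "Successfully posted FHIR resource" "$log"; then
    echo "upload reported success"
    return 0
  fi
  echo "success message not found in $log"
  return 1
}

# Example of capturing the output first (commented out so the sketch runs anywhere):
# cgpclient/scripts/upload_dragen_run ... -cfg config.yaml 2>&1 | tee upload.log
# upload_succeeded upload.log
```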


Part 2: List Available Files

Now let's verify which files were uploaded successfully.

Explore the List Command

cgpclient/scripts/list_files -h

List Your Uploaded Files

cgpclient/scripts/list_files \
  -i 250702_A00123_0001_CE956995D6 \
  -p p5678 \
  -r r9012 \
  -cfg config.yaml

Expected Output

You should see output similar to this:

name                                   size  content_type     last_updated         author_ods_code    referral_id    participant_id      sample_id  run_id
-----------------------------------  ------  ---------------  -------------------  -----------------  -------------  ----------------  -----------  -----------------------------
2514204_S2_L001_R1_001.fastq.ora     355786  text/fastq       2025-07-02T14:58:30  8J834              r9012          p5678                 2514204  250702_A00123_0001_CE956995D6
2514204_S2_L001_R2_001.fastq.ora     355786  text/fastq       2025-07-02T14:58:30  8J834              r9012          p5678                 2514204  250702_A00123_0001_CE956995D6
RunInfo.xml                              547  application/xml  2025-07-02T14:58:30  8J834              r9012          p5678                 2514204  250702_A00123_0001_CE956995D6
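
If you save the listing to a file, the file names can be pulled out of the first column for use in Part 3. This is a sketch under the assumption that the output keeps the layout shown above (one header row, one dashed separator row, then whitespace-separated columns); `names_from_listing` is a hypothetical helper.

```shell
# Hypothetical helper: print the 'name' column from a saved list_files table.
names_from_listing() {
  # Skip the header row and the dashed separator (the first two lines),
  # ignore blank lines, and print the first whitespace-separated column.
  awk 'NR > 2 && NF > 0 { print $1 }' "$1"
}

# Example of saving the listing first (commented out so the sketch runs anywhere):
# cgpclient/scripts/list_files -i <run_id> -p <patient_id> -r <referral_id> -cfg config.yaml > listing.txt
# names_from_listing listing.txt
```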

Part 3: Download Files

Finally, let's download files from the platform.

Explore Download Options

cgpclient/scripts/download_file -h

Download a Specific File

Choose a file from your list output and download it:

cgpclient/scripts/download_file \
  -f 2514204_S2_L001_R1_001.fastq.ora \
  -cfg config.yaml

File Location

Downloaded files will be saved to the output_dir specified in your config.yaml (/tmp/output in the configuration above).

Verify the Download

Check that your file was downloaded successfully:

ls -la /tmp/output/
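
Beyond listing the directory, you can compare the downloaded file's size with the size column from the list output. This sketch assumes that column is in bytes and that the file is stored unchanged on download; `check_download_size` is a hypothetical helper.

```shell
# Hypothetical helper: confirm a downloaded file exists, is non-empty,
# and has the expected size in bytes.
check_download_size() {
  local file="$1"
  local expected="$2"
  if [ ! -s "$file" ]; then
    echo "missing or empty: $file"
    return 1
  fi
  local actual
  # wc -c counts bytes; tr strips padding that some wc builds add.
  actual=$(wc -c < "$file" | tr -d ' ')
  if [ "$actual" -ne "$expected" ]; then
    echo "size mismatch for $file: expected $expected, got $actual"
    return 1
  fi
  echo "ok: $file ($actual bytes)"
}
```

For the example file above, the call would be `check_download_size /tmp/output/2514204_S2_L001_R1_001.fastq.ora 355786`, using the size reported by your own listing.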

Interactive Exercise

Challenge: Try downloading another file from your list. Can you download the RunInfo.xml file as well?


Workshop Summary

Congratulations! You have completed the dWGS upload workflow:

  • Configured CGPClient with your credentials and settings
  • Uploaded a complete DRAGEN run with FASTQ files and metadata
  • Listed available files to verify the upload
  • Downloaded files from the platform