Content from Introduction


Last updated on 2025-03-11 | Edit this page

Estimated time: 13 minutes

Overview

Questions

  • Why QGIS?
  • How to start QGIS on HPC Clusters?
  • How to load and visualize data in QGIS?
  • How to process and export data in QGIS?

Objectives

  • Explain why we should use QGIS
  • Demonstrate how to start QGIS on HPC Clusters
  • Demonstrate how to load and visualize data in QGIS
  • Demonstrate how to process and export data in QGIS

Introduction to QGIS


GIS stands for ‘Geographical Information System’. We can use a GIS application, such as ArcGIS and QGIS to manipulate spatial information.

QGIS is a free and open-source software that runs on various operating systems. It offers a wide range of functionality, such as vector and raster analysis, geoprocessing, geocoding, georeferencing, web mapping, and 3D visualization. QGIS also supports many data formats and standards, such as Shapefile, GeoTIFF, GeoJSON, WMS, WFS, and PostGIS.

Why QGIS? It’s free and flexible.

  1. Cost-free: Enjoy QGIS without any financial burden. It’s completely free, no hidden fees.
  2. Free as in “Do It Your Way”: You could extend QGIS to meet your specific needs, sponsor development or contribute your own code.
  3. Works where you work: Run QGIS on macOS, Windows, and Linux (so available for HPC Clusters).
  4. Always getting better: Benefit from rapid development because anyone can add new features and improve on existing ones.
  5. Never get stuck: Access extensive documentation and a large and active supportive community is there for help.
  6. Easily integrate Artificial Intelligence (AI) and GeoAI.

Open QGIS on Clusters


We can start QGIS via ThinLinc Client or Gateway.

1. Start QGIS via ThinLinc

Follow up with the Setup page, and connect with ThinLinc. There are two ways to start QGIS via ThinLinc. The two ways are fundamentally the same but one is interactive, and the other is typing code.

(1) Interactive Way

To open QGIS as an interactive job, we could go to “Cluster Software” and select “QGIS”, as the figure below:

Picture1

Then select the “workshop” queue as below:

Screenshot 2025-02-24 at 4 15 16 PM

Then hit “No”:

Screenshot 2025-02-24 at 4 15 41 PM

Now input two cores and five minutes and hit Okay. You don’t need to specifically request memory because memory will be relocated proportional with cores. But if you do, include unit such as “4G”.

Screenshot 2025-02-24 at 4 16 10 PM
(2) Typing Code Way

We could start the Terminal as below:

Screenshot 2025-02-25 at 9 51 03 AM

To look up QGIS module, we could do:

SH

module spider qgis

To start an interactive job and open QGIS:

SH

sinteractive -A workshop -N1 -c8 -t8:00:00
module load qgis
qgis

2. Start QGIS via Gateway

Gateway, also named Open OnDemand, is a Web interface includes file explorer, interactive apps including QGIS.​ We have to use our own accounts to login Gateway, not the training accounts we used for this workshop. Go to Negishi Gateway, login with our purdue accounts (when we have account on Clusters) and connect QGIS as the figure below.

Screenshot 2025-02-25 at 10 05 05 AM

Callout

We could also open QGIS with Gateway, in the Typing Code Way. We could start a terminal as below.

Start Terminal in Gateway
Start Terminal in Gateway

View Spatial Data


In Geographic Information Systems (GIS), data is primarily represented in two fundamental formats: vector and raster.

Vector Data

  • Representation:
    • Vector data uses geometric objects—points, lines, and polygons—to represent spatial features.
    • Points represent individual locations (e.g., a city, a tree).
    • Lines represent linear features (e.g., roads, rivers).
    • Polygons represent areas (e.g., lakes, buildings, administrative boundaries).  
  • Characteristics:
    • Precision: Vector data is excellent for representing discrete features with clear boundaries, offering high precision.
    • Scalability: Vector data can be scaled up or down without losing quality.
    • Data Storage: Typically, vector data requires less storage space than raster data for representing discrete features.
    • Use Cases: Best suited for representing features with distinct boundaries, such as roads, property lines, and political boundaries.
  • Vector File Types:
    • Shapefile (.shp): A very common geospatial vector data format for GIS software. It actually consists of several files (.shp, .shx, .dbf, etc.)
    • GeoJSON (.geojson): A popular open standard format that uses JavaScript Object Notation (JSON) to represent geographic features.
    • KML/KMZ: Used by Google Earth for displaying geographic data.
    • File format is handled by GDAL/OGR package with a full list

Raster Data

  • Representation:
    • Raster data represents spatial information as a grid of cells (pixels). Each cell contains a value representing a specific attribute (e.g., elevation, temperature, land cover).  
  • Characteristics:
    • Continuous Data: Raster data is ideal for representing continuous attributes, such as elevation, temperature, and satellite imagery.
    • Data Storage: Raster data can require significant storage space, especially at high resolutions.
    • Analysis: Raster data is well-suited for spatial analysis involving calculations and modeling.
    • Use Cases: Best suited for representing continuous surfaces, such as elevation models, satellite imagery, and aerial photographs.
  • Raster File Types:
    • TIFF: Basic image format, no geographic information.
    • GeoTIFF: A TIFF file with added geospatial metadata, enabling it to be used in GIS applications.
    • COG (Cloud Optimized GeoTIFF): A type of GeoTIFF with a specific data structure optimized for fast access in cloud environments, often using tiled data storage.
    • File format is handled by GDAL/OGR package with a full list

Key Differences

  • Structure: Vector data uses geometric shapes, while raster data uses a grid of cells.
  • Data Type: Vector is for discrete features, raster is for continuous phenomena.
  • Precision: Vector is generally more precise, while raster’s precision depends on cell size.
  • Storage: Vector often uses less storage for discrete features. Raster data storage size is heavily dependant on resolution.

Load Spatial Data

(1) Load vector data from files
  • Step1: Layer -> Add Layer -> Add Vector Layer
Picture2
  • Step2: Input your path of “alaska.shp” and hit “add”
  • Step 3: You will see the shapefile has been added to Layers as below.

Challenge 1: Try yourself

try yourself to add airports.shp to layers.

Picture3

Callout

When adding a data source, QGIS attempts to identify its Coordinate Reference System (CRS) from sources like a shapefile’s .prj file. If no CRS information is found, QGIS prompts you to specify it. You can modify this behavior in Settings -> Options -> CRS to automatically assign either the project’s CRS or a designated default CRS. (Graser et al., 2017)

(2) Load CSV files
  • Step 1: Layer -> Add Layer -> Add Delimited Text Layer 
  • Step 2: Make changes as the red box in the picture below.
Screenshot 2025-02-25 at 2 41 23 PM
  • Step 3: You will see the shapefile has been added to Layers as below.
Picture4
(3) Load Raster files
  • Step1: Layer -> Add Layer -> Add Raster Layer
  • Step2: Input your path of “landcover.img” and hit “add”
  • Step 3: You will see the raster has been added to Layers as below.
Picture5

Challenge 1: Try yourself

try yourself to add SR_50M_alaska_nad.tif (Hillshade GeoTiff) to layers.

Challenge 2: A Question?

Did you find anything weird about the Hillshade image showing above?

Yep, the hillshade should be in the North instead of the South. Let’s check data Properties and change looking.

Process Spatial Data


Filter Vector Data

Scenario: My Grandma wants to have a trip in Alaska but doctor said she shouldn’t go to places with high elevation due to the heart problem. So I will find airports with low elevantion for her. For example, I found the airport with elevation lower than 1000 ft.

Solution:

  • Step1: Right click the data “airports” and select “Filter”
  • Step2: Select “ELEV” from “Fields”, and input the Specific Filter Expression as below:
  • Step3: Hit “OK” and now only airport with elevation lower than 1000 ft show up.
  • Step4-Export data: right click the data and selelct “Export” -> “Save Features As”. Input information as figure below and hit “OK”.

Raster Calculation

Scenario: I’d like find some sunny slope where my Grandma and I can go skiing. So I found the places where Hillshade is smaller than 100, for example.

Solution:

  • Step1: Turn on the Processing Toolbox if it’s off, and Search “Raster Calculator”.

  • Step2: Input the “Input Layers”, “Expression”, “Output CRS”, and “Calculated” as the Output file.
  • Step3: Hit “Run” and change the looking of output.
  • You have already written out the output file. But if you didn’t, you could always export the data via right clicking it and select “Export” -> “Save As”. Then input information as figure below and hit “OK”.

Key Points

  • QGIS is a free and open-source software that runs on various platforms, such as Windows, Mac, and Linux.
  • QGIS has a large and active community of users and developers who contribute to its features, plugins, documentation, and support.
  • We can use QGIS via ThinLinc Client or Gateway on HPC Clusters.
  • We learned how to load and visualize vector and raster data.
  • We learned how to process data and export them.

Content from Project


Last updated on 2025-03-13 | Edit this page

Estimated time: 12 minutes

Overview

Questions

  • What is DEM?
  • What types of downstream information can be derived from a DEM?
  • What is the topographic landscape around Purdue?
  • What are the hydrological features around Purdue?

Objectives

  • Define and explain the preliminary knowledge.
  • Show how to install plugins with QGIS.
  • Identify the types of downstream information that can be derived from a DEM.
  • Demonstrate how to analyze topographic landscape and hydrological features with QGIS.
  • Explain the process of converting raster data to vector data.
  • Demonstrate how to run multiple instances in one shot.

Preliminary Knowledge


Knowledge Graph
Knowledge Graph

1. DEM (Digital Elevation Model):

A DEM is essentially a 3D representation of a terrain’s surface. It’s a raster dataset, meaning it’s made up of a grid of cells (pixels), where each cell contains an elevation value.

  • Think of it as a digital map that shows the height of the land at every point.
  • DEMs are the foundation for creating slope, hillshade maps and analyzing hydrology.
  • The DEM we used today is LiDAR (Light Detection and Ranging) data with one meter resolution from USGS, which uses laser pulses from aircraft or drones to measure terrain elevation accurately.

2. Slope:

Slope refers to the steepness or gradient of the terrain. It’s typically expressed in degrees (0 to 90) or as a percentage. A slope of 0 degrees indicates a flat surface, while a slope of 90 degrees represents a vertical surface.

Slope maps are derived from DEMs and are useful for analyzing things like:

  • Erosion risk
  • Water runoff
  • Suitability for construction

3. Hillshade:

Hillshade refers to simulating how sunlight would illuminate the terrain. It enhances the visualization of landforms by showing how light and shadow interact with the surface. The hillshade effect is created by setting an azimuth and vertical angle for a light source. The Azimuth is the direction the light is coming from, and the vertical angle is the angle of the light source above the horizon.

Hillshade maps are also derived from DEMs and are valuable for:

  • Visualizing terrain features and creating visually appealing maps
  • Identifying subtle topographic details

4. Analyzing hydrology

The workflow: DEM Input → Fill Sinks → Flow Direction → Flow Accumulation → Stream Extraction -> Subbasins

  • Before calculating flow, the DEM needs to be preprocessed to remove any small depressions (sinks) that may trap water. These are often errors in the data or natural terrain features that disrupt continuous flow.
  • The flow direction represents the direction in which water will move from each cell based on elevation. The most common method assigns flow using the D8 algorithm, where water moves to the steepest downward slope among the eight neighboring cells.
  • The flow accumulation raster represents the number of upstream cells contributing flow to each cell. This helps identify areas with higher accumulated flow, which can indicate potential stream paths.
    • Low accumulation values: Represent hilltops or ridges.
    • High accumulation values: Represent valleys or stream networks.
  • Streams are extracted by applying a threshold to the flow accumulation raster. A threshold defines the minimum number of contributing cells required to form a stream.
    • Higher thresholds result in fewer, larger streams.
    • Lower thresholds detect smaller streams, but may introduce noise.
  • Last step is to extract Watersheds/Subbasins, an area of land that drains all precipitation and surface runoff to a common outlet(such as a river, lake, or ocean).
    • It is defined by topographic divides (ridges or hills) that separate it from adjacent watersheds.

    • Watersheds can be small (for example, a stream watershed) or large (covering multiple rivers).

    • Same concept as Catchments and Basins, but they are in different scales:

      Basin → contains multiple → Sub-Basins → made up of → Watersheds → includes smaller → Catchments

5. Wetness Index

The Wetness Index, often referred to as the Topographic Wetness Index (TWI), is a terrain attribute that quantifies the propensity of a location to accumulate water. It’s a crucial tool in hydrological and ecological modeling, providing insights into soil moisture, runoff potential, and habitat distribution.

  • The TWI is based on the principle that areas with a large upslope contributing area and a shallow slope are more likely to accumulate water.

  • It combines information about the upslope contributing area (the area that drains into a given point) and the slope gradient.

  • \(TWI = ln(\dfrac{\alpha}{tan\beta})\)

    where \(\alpha\) is specific catchment area (i.e. the upslope contributing area per unit contour length), \(\beta\) is the local slope angle.

Load Data

Let’s first load the two rasters in data directory.

Screenshot 2025-03-04 at 2 21 14 PM

Install Plugins


Install QuickMapServices

QuickMapServices (QMS) is a valuable plugin for QGIS that simplifies the process of adding and using a vast array of online basemaps and web mapping services.

  • Go to Plugins -> Manage and Install Plugins
  • Search “QuickMapSevices” and click “Install Plugin”
  • Now click “Search QMS” and search “Google Maps”, as below.
Screenshot 2025-03-04 at 12 11 27 PM
Screenshot 2025-03-04 at 12 11 54 PM
Picture6

Install WhiteboxTools

WhiteboxTools is a powerful, open-source geospatial data analysis platform that offers a wide range of geospatial analysis tasks including DEM analysis, hydrological modeling and more.

  • Step 1: Install WhiteboxTools for QGIS

    • go to Plugins -> Manage and Install Plugins
    • search “WhiteboxTools for QGIS” and click “Install Plugin”
  • Step 2: Install Whitebox Executables

    • option 1: go to WhiteboxTools and download.
    • option 2: run the command below:

    SH

    wget https://www.whiteboxgeo.com/WBT_Linux/WhiteboxTools_linux_amd64.zip
  • Step 3: Unzip it by running

    SH

    unzip WhiteboxTools_linux_amd64.zip
  • Step 4: Configurate it

    • go to Settings -> options -> Processing -> providers -> WhiteboxTools

    • go to WhiteboxTools executable and double click the content and click “…” to set it as “whitebox_tools” in your unzipped directory(the executable).

      your executable should be as below (you can also copy and paste):

      SH

      /home/trainXX/WhiteboxTools_linux_amd64/WBT/whitebox_tools
      Screenshot 2025-03-04 at 11 57 25 AM
  • Step 5: Now you could see WhiteboxTools show up at the bottom of the Processing Toolbox as below.

    Screenshot 2025-03-04 at 12 07 08 PM

Merge DEM Data


  • Search “merge” in Processing Toolbox as below

    image
  • Select the two rasters as input

image
  • Type the output name in your scratch directory and hit “Run”.
Output file will be added to layers if you check “Open outputfile after running algorithm”
Output file will be added to layers if you check “Open outputfile after running algorithm”
  • Now you could merged DEM in your Layers as below
image

Calculate Slope


Search “Slope” in Processing Toolbox, select the elevation layer, type the output file, and hit “Run” as below.

Callout

Spend two minutes to explore the slope where your are familar and see if it makes sense. For example, I used it to find a nice slope to snow sled.

image

Calculate Hillshade


Search “Hillshade” in Processing Toolbox, and input the elevation layer, parameters, and output as the figure below.

We use the default settings here that typically place the light source in the northwest. It is a common convention in cartography for visualizing topography. (Vertical Views | GEOG 486: Cartography and Visualization, n.d.)

Analyze Hydrology


(1) Fill Sinks

  • Step 1: type a few words to search “FillDepressionsWangAndLiu” as below

  • Step 2: select dem, type output file, and hit “Run”

  • Step 3: hit close after the progress reach 100%, as below

(2) Flow Direction

  • Step 1: search “d8” to find “D8Pointer” in the Processing Toolbox.
  • Step 2: select “fillsinks” outputed from above procedure as the input DEM file and type the output as below
  • Step 3: hit close after the progress reach 100%; you will see the flowdirection layer has been added to Layers.

(3) Flow Accumulation

  • Step 1: type a few words to search “D8FlowAccumulation” in the Processing Toolbox.
  • Step 2: select “flowdirec” outputed from above procedure as the input and type output
check “Is input the D8 flow pointer”
check “Is input the D8 flow pointer”
  • Step 3: hit close after the progress reach 100%; you will see the flowaccum layer has been added to Layers.

(4) Extract Streams

  • Step 1: search “ExtractStreams” in the Processing Toolbox.

  • Step 2: select “flowaccum” outputed from above procedure as the input and type output, set the Channelization Threshold as 1,000,000 as below

  • Step 3: hit close after the progress reach 100%; you will see the stream10_6 layer has been added to Layers.

Callout

To better see the streams, you could uncheck other layers except the google map layer, and zoom in to compare the generated streams with the ones on the map.

Let’s convert it to vector: Go to Raster -> Conversion -> Polygonize (Raster to Vector), as below

you can choose different file format of vector
you can choose different file format of vector

Try different Channelization Threshold and compared the extracted streams.

(5) Watershed

  • Step 1: search “Subbasins” in the Processing Toolbox.
  • Step 2: select “flowdirect” as the D8 Pointer File, stream10_6 as the InputStreams File and type output, as below

Callout

  • You could also click “Run as Batch Process” and run multiple instances in one shot, as below

All algorithms (including models) can be executed as a batch process. That is, they can be executed using not just a single set of inputs, but several of them, executing the algorithm as many times as needed. This is useful when processing large amounts of data, since it is not necessary to launch the algorithm many times from the toolbox.

Calculate Topographic Wetness Index


  • Step 1: search “WetnessIndex” in the Processing Toolbox.

  • Step 2: select “subbasins10_6” calculated from the above procedure as the SCA File, slope as the Input Slope File and type output, as below

  • Step 3: After the calculation finishs, right click the output Layer, and select “Properties” and set as below

Output

the smaller Channelization Threshold is, the more channels you generated, and the high resolution you got (As below shows more accurate Wetness Index, with Channelization Threshold, 1000)

Key Points

  • DEMs are the foundation for creating slope, hillshade maps and analyzing hydrology.
  • Install plugins with QGIS, for example, QuickMapServices (QMS) offers a variety of base maps and WhiteboxTools offers a wide range of geospatial analysis tasks.
  • Demonstrate how to analyze topographic landscape and hydrological features with QGIS.
  • We can choose different file format to vectorize raster data.
  • “Run as Batch Process” can run multiple instances in one shot.

Content from Coding


Last updated on 2025-03-12 | Edit this page

Estimated time: 12 minutes

Overview

Questions

  • What is QGIS Python Console?
  • How to automate geospatial tasks with QGIS?

Objectives

  • Demonstrate how to use QGIS Python Console.
  • Demonstrate how to automate QGIS tasks using Python coding and processing from the command line.
  • Demonstrate how to monitor CPU and memory usage.

QGIS Python Console


  • The QGIS Python Console is an interactive shell for Python command executions.

  • It also has a Python file editor that allows you to edit and save your Python scripts.

  • Modules from QGIS (analysis, core, gui, server, processing, 3d) and Qt (QtCore, QtGui, QtNetwork, QtWidgets, QtXml) as well as Python’s math, os, re and sys modules are already imported and can be used directly (25.3. QGIS Python Console — QGIS Documentation  Documentation, n.d.).

    The screenshots below show how to open QGIS Python Console, and how it looks like.

Screenshot 2025-03-04 at 10 58 51 AM
Screenshot 2025-03-04 at 11 00 04 AM

Let’s create an output directory through Terminal, you should replace trainXX with your training account.

SH

mkdir -p /scratch/negishi/trainXX/project/code_output

Let’s copy the code below to the python console.

PYTHON

# change work directory
path ="/scratch/negishi/trainXX/project/code_output"
os.chdir(path)

Load Data

You can load data in two ways, depending on if you want it on the QGIS map canvas. Choose the first method if you only require computation; select the second if visualization is also needed.

The first way

There are two steps to load data.

  • Step 1: data is loaded in memory.

  • Step 2: the data is added to QGIS map canvas. This is not necessary if you just do computation and do not visualize data.

(1) Load Vector Data

PYTHON

vectorfile ="/scratch/negishi/liu4201/project/output/stream10_6.geojson"

# The format is:
# vlayer = QgsVectorLayer(data_source, layer_name, provider_name)

vlayer = QgsVectorLayer(vectorfile, "stream10_6", "ogr")

When you create a QgsVectorLayer or QgsRasterLayer object, it’s just a layer in memory. It doesn’t automatically get added to the QGIS map canvas. You have more control over the layer’s properties and how it’s loaded. You can even choose to not load it. You can also set things like the layer’s CRS, geometry type, and other options before adding it to the map.

PYTHON

if not vlayer.isValid():
    print("Layer failed to load!")
else:
    QgsProject.instance().addMapLayer(vlayer) # load the layer

Now it gets added to the QGIS map canvas.

(2) Load Raster Data

PYTHON

# use the firt way to load rasterfile 1
# load raster to memory
path_to_tif = rasterpath + "USGS_1M_16_x50y448_IN_Indiana_Statewide_LiDAR_2017_B17.tif"
rlayer = QgsRasterLayer(path_to_tif, "dem")

PYTHON

# added raster to the QGIS map canvas
if not rlayer.isValid():
    print("Layer failed to load!")
else:
    QgsProject.instance().addMapLayer(rlayer) # load the layer

The second way

It loads a vector or raster layer and immediately adds it to the map canvas

(1) Load Vector Data

PYTHON

vlayer = iface.addVectorLayer(vectorfile, "stream10_6", "ogr")
if not vlayer:
  print("Layer failed to load!")
(2) Load Raster Data

PYTHON

#use the second way to load rasterfile 2
path_to_tif2 = rasterpath + "USGS_1M_16_x50y449_IN_Indiana_Statewide_LiDAR_2017_B17.tif"   
iface.addRasterLayer(path_to_tif2, "dem2")

Merge DEM Data

PYTHON

rlayer = QgsRasterLayer(path_to_tif, "dem")
rlayer2 = QgsRasterLayer(path_to_tif2, "dem2")

# Merge rasters
processing.run("gdal:merge", {"INPUT": [rlayer, rlayer2], "OUTPUT": "merged_dem.tif"})

#check the merged_dem
iface.addRasterLayer("merged_dem.tif","merged_dem")

Calculate Slope

PYTHON

# Calcualting slope
processing.run("qgis:slope", {"INPUT": "merged_dem.tif", "OUTPUT": "slope.tif"})

#check it 
iface.addRasterLayer("slope.tif","slope")

You may wonder how to easily write the python code to perform tasks. Well, QGIS provide a smart way.

Step 1: Search the processing in the Processing Toolbox with QGIS interface, and fill in all the parameters.

Step 2: Click “Advanced”, and select “Copy as Python Command”.

Step 3: Paste it to run in the python console.

Calculate Hillshade

PYTHON

processing.run("native:hillshade", {'INPUT':'merged_dem.tif','Z_FACTOR':1,'AZIMUTH':300,'V_ANGLE':40,'OUTPUT':'hillshade.tif'})

Using processing from the command line


QGIS includes the Processing Executor, a command-line tool that enables the execution of QGIS processing algorithms and models (23.8. Using Processing From the Command Line — QGIS Documentation  Documentation, n.d.), including those from plugins, outside of the QGIS Desktop interface.

To use it, Let first close QGIS application, and then we will submit a job to HPC Cluster.

  • Step 1: Let’s start a new file named “myhydrojob” and copy in the code below.

SH


#!/bin/sh -l
# FILENAME:  myjob10_4

#SBATCH -A workshop
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8 
#SBATCH --time=00:10:00
#SBATCH --job-name myhydro_job
#SBATCH --output=/home/liu4201/jobs/hydro10_4.out    ## replace with your own path
#SBATCH --error=/home/liu4201/jobs/hydro10_4.out     ## replace with your own path

module reset
module load qgis
module load monitor

export QT_QPA_PLATFORM=offscreen

monitor cpu percent --sample-rate 10 --total > cpu.log &
monitor cpu memory --sample-rate 10 --percent > mem.log &

workdir=/scratch/negishi/liu4201/project/code_output  ## replace with your own path

qgis_process run wbt:FillDepressionsWangAndLiu  -- dem=$workdir/merged_dem.tif output=$workdir/fillsinks.tif
qgis_process run wbt:D8Pointer  -- dem=$workdir/fillsinks.tif output=$workdir/flowdirect.tif
qgis_process run wbt:D8FlowAccumulation --distance_units=meters --area_units=m2 --ellipsoid=EPSG:7030 --input=$workdir/flowdirect.tif --out_type=0 --log=false --clip=false --pntr=true --esri_pntr=false --output=$workdir/flowaccum.tif
qgis_process run wbt:ExtractStreams --distance_units=meters --area_units=m2 --ellipsoid=EPSG:7030 --flow_accum=$workdir/flowaccum.tif --threshold=10000 --zero_background=false --output=$workdir/stream10_4.tif
qgis_process run wbt:Subbasins --distance_units=meters --area_units=m2 --ellipsoid=EPSG:7030 --d8_pntr=$workdir/flowdirect.tif --streams=$workdir/stream10_4.tif --esri_pntr=false --output=$workdir/subbasins10_4.tif
qgis_process run wbt:WetnessIndex --distance_units=meters --area_units=m2 --ellipsoid=EPSG:7030 --sca=$workdir/subbasins10_4.tif --slope=$workdir/slope.tif --output=$workdir/wetnessIndex10_4.tif

Step 1: Search the processing in the Processing Toolbox with QGIS interface, and fill in all the parameters.

Step 2: Click “Advanced”, and select “Copy as qgis_process Command”.

Step 3: Paste it to run in the job script.

  • Step 2: Then we could submit the job with typing the code below in Terminal.

SH

sbatch myhydrojob

Callout

monitor is a module to monitor your resources, such as cpu and memory. You can look up for help after loading the module, as below.

To look up help for “monitor”, after “module load monitor”, run

SH

monitor --help

To look up help for “qgis_process”, after “module load qgis”, run

SH

qgis_process --help

Coding with Standalone Python


Now let’s experiment on a different Channelization Threshold, 10,000 by coding with standalone Python Script.

  • Step 1: Start a new python script named “hydro_10_5.py” and copy in the code below. You should replace the path with your own where it is noted.

PYTHON

import os
from qgis.core import *
import sys

# initialize qgis
qgs_app = QgsApplication([], False)
qgs_app.initQgis()

#add whiteboxTools path
module_path = "/home/liu4201/apps/qgis_tools/WhiteboxTools_linux_amd64/WBT" ## replace with your own path
sys.path.append(module_path)

#import processing tools
import whitebox as wbtool
#from qgis import processing
#from processing.core.Processing import Processing

#Processing.initialize()
# change work directory
path ="/scratch/negishi/liu4201/project/code_output/"  ## replace with your own path
os.chdir(path)

#for alg in QgsApplication.processingRegistry().algorithms():
#    print(alg.id(), "->", alg.displayName())

wbt = wbtool.WhiteboxTools()

#processing.run("native:hillshade", {'INPUT':'merged_dem.tif','Z_FACTOR':1,'AZIMUTH':300,'V_ANGLE':40,'OUTPUT':'hillshade.tif'})

wbt.extract_streams(flow_accum = path + "flowdirect.tif",
                   threshold=100000,
                   zero_background=False,
                   output= path + "stream10_5.tif")
wbt.subbasins(d8_pntr= path + "flowdirect.tif", streams= path + "stream10_5.tif",esri_pntr=False, output= path + "subbasins10_5.tif")
wbt.wetness_index(sca= path + "subbasins10_5.tif", slope= path + "slope.tif", output= path + "wetnessIndex10_5.tif")

qgs_app.exitQgis()

Note 1: Refer WhiteBoxTools User Manual for the code to use the algorithm, “Copy as Python Command” provided by QGIS does not match the real WhiteBoxTools algorithms.

Note 2: Refer to the code commented out in the snippet to use processing tools provided by QGIS (for example, native:hillshade). To highlight, it’s necessary to include the code below in the right place.

PYTHON

from qgis import processing
from processing.core.Processing import Processing

Processing.initialize()
  • Step 2: Start a file named “myjob10_5” and copy the code below in. You should replace the path with your own where it is noted in parenthesis.

SH

#!/bin/sh -l
# FILENAME:  myjob10_5

#SBATCH -A workshop
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8 
#SBATCH --time=00:10:00
#SBATCH --job-name myhydro_job
#SBATCH --output=/home/liu4201/jobs/hydro10_5.out  ## replace with your own path
#SBATCH --error=/home/liu4201/jobs/hydro10_5.out   ## replace with your own path

module reset
module load qgis
module load monitor

export QT_QPA_PLATFORM=offscreen

monitor cpu percent --sample-rate 10 --total > cpu1.log &
monitor cpu memory --sample-rate 10 --percent > mem1.log &

python /home/liu4201/qgis_data/code/hydro_10_5.py  ## replace with your own path
  • Step 3: Submit your job via running the code below in terminal.

SH

sbatch myjob

The export QT_QPA_PLATFORM=offscreen command tells Qt (the GUI framework used by QGIS) to use the “offscreen” platform plugin.

Key Points

  • To automate geospatial tasks with QGIS algorithms, we can use three coding approaches: the QGIS Python Console, standalone Python scripts, and command-line execution.
  • For QGIS Python Console, data can be loaded in two ways depending on whether you want to add it to the QGIS map canvas.
  • Standalone Python scripts and command-line execution allow us to submit jobs to HPC clusters, enabling enhanced automation of QGIS geospatial tasks.
  • We monitor CPU and memory usage within jobs to guide our computation resource requests.