Introduction
Last updated on 2025-03-11
Estimated time: 13 minutes
Overview
Questions
- Why QGIS?
- How to start QGIS on HPC Clusters?
- How to load and visualize data in QGIS?
- How to process and export data in QGIS?
Objectives
- Explain why we should use QGIS
- Demonstrate how to start QGIS on HPC Clusters
- Demonstrate how to load and visualize data in QGIS
- Demonstrate how to process and export data in QGIS
Introduction to QGIS
GIS stands for ‘Geographical Information System’. We can use a GIS application, such as ArcGIS or QGIS, to manipulate spatial information.
QGIS is free and open-source software that runs on various operating systems. It offers a wide range of functionality, such as vector and raster analysis, geoprocessing, geocoding, georeferencing, web mapping, and 3D visualization. QGIS also supports many data formats and standards, such as Shapefile, GeoTIFF, GeoJSON, WMS, WFS, and PostGIS.
Why QGIS? It’s free and flexible.
- Cost-free: Enjoy QGIS without any financial burden. It’s completely free, no hidden fees.
- Free as in “Do It Your Way”: You can extend QGIS to meet your specific needs, sponsor development, or contribute your own code.
- Works where you work: Run QGIS on macOS, Windows, and Linux (so it is available on HPC Clusters).
- Always getting better: Benefit from rapid development because anyone can add new features and improve on existing ones.
- Never get stuck: Access extensive documentation and get help from a large, active, and supportive community.
- Easily integrate Artificial Intelligence (AI) and GeoAI.
Open QGIS on Clusters
We can start QGIS via ThinLinc Client or Gateway.
1. Start QGIS via ThinLinc
Follow the Setup page to connect with ThinLinc. There are two ways to start QGIS via ThinLinc. They are fundamentally the same, but one is interactive and the other uses typed commands.
(1) Interactive Way
To open QGIS as an interactive job, go to “Cluster Software” and select “QGIS”, as shown in the figure below:
Then select the “workshop” queue as below:
Then hit “No”:
Now enter two cores and five minutes, then click “Okay”. You do not need to request memory explicitly because memory is allocated in proportion to the number of cores. If you do request it, include a unit, such as “4G”.
2. Start QGIS via Gateway
Gateway, also known as Open OnDemand, is a web interface that includes a file explorer and interactive apps, including QGIS. To log in to Gateway, we must use our own accounts, not the training accounts used for this workshop. Go to Negishi Gateway, log in with your Purdue account (if you have an account on the Clusters), and connect to QGIS as shown in the figure below.
Callout
We can also open QGIS through Gateway the Typing Code Way by starting a terminal, as shown below.
View Spatial Data
In Geographic Information Systems (GIS), data is primarily represented in two fundamental formats: vector and raster.
Vector Data
- Representation:
- Vector data uses geometric objects—points, lines, and polygons—to represent spatial features.
- Points represent individual locations (e.g., a city, a tree).
- Lines represent linear features (e.g., roads, rivers).
- Polygons represent areas (e.g., lakes, buildings, administrative boundaries).
- Characteristics:
- Precision: Vector data is excellent for representing discrete features with clear boundaries, offering high precision.
- Scalability: Vector data can be scaled up or down without losing quality.
- Data Storage: Typically, vector data requires less storage space than raster data for representing discrete features.
- Use Cases: Best suited for representing features with distinct boundaries, such as roads, property lines, and political boundaries.
- Vector File Types:
- Shapefile (.shp): A very common geospatial vector data format for GIS software. It actually consists of several files (.shp, .shx, .dbf, etc.)
- GeoJSON (.geojson): A popular open standard format that uses JavaScript Object Notation (JSON) to represent geographic features.
- KML/KMZ: Used by Google Earth for displaying geographic data.
- File formats are handled by the GDAL/OGR package, whose documentation gives the full list of supported formats.
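To make the three geometry types concrete, here is a minimal Python sketch that writes one point, one line, and one polygon as GeoJSON using only the standard library. The feature names and coordinates are invented for illustration and are not part of the workshop data.

```python
import json

# Hypothetical features; coordinates are [longitude, latitude] and illustrative only.
point = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-149.9, 61.2]},
    "properties": {"name": "example city"},
}
line = {
    "type": "Feature",
    "geometry": {
        "type": "LineString",
        "coordinates": [[-149.9, 61.2], [-147.7, 64.8]],
    },
    "properties": {"name": "example road"},
}
# A polygon ring must be closed: the first and last positions are identical.
polygon = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[
            [-150.0, 61.0], [-149.0, 61.0], [-149.0, 62.0],
            [-150.0, 62.0], [-150.0, 61.0],
        ]],
    },
    "properties": {"name": "example lake"},
}

collection = {"type": "FeatureCollection", "features": [point, line, polygon]}
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Saving the printed text to a file with a .geojson extension should give a small file that can be opened like any other vector layer.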
Raster Data
- Representation:
- Raster data represents spatial information as a grid of cells (pixels). Each cell contains a value representing a specific attribute (e.g., elevation, temperature, land cover).
- Characteristics:
- Continuous Data: Raster data is ideal for representing continuous attributes, such as elevation, temperature, and satellite imagery.
- Data Storage: Raster data can require significant storage space, especially at high resolutions.
- Analysis: Raster data is well-suited for spatial analysis involving calculations and modeling.
- Use Cases: Best suited for representing continuous surfaces, such as elevation models, satellite imagery, and aerial photographs.
- Raster File Types:
- TIFF: Basic image format, no geographic information.
- GeoTIFF: A TIFF file with added geospatial metadata, enabling it to be used in GIS applications.
- COG (Cloud Optimized GeoTIFF): A type of GeoTIFF with a specific data structure optimized for fast access in cloud environments, often using tiled data storage.
- File formats are handled by the GDAL/OGR package, whose documentation gives the full list of supported formats.
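The grid-of-cells idea can be sketched in a few lines of Python. The values, the origin, and the cell size below are all made up for illustration; a real raster file stores this georeferencing in its metadata.

```python
# A tiny made-up elevation raster: 3 rows x 4 columns, values in metres.
grid = [
    [120, 135, 150, 160],
    [110, 125, 140, 155],
    [100, 115, 130, 145],
]

# Assumed georeferencing: map coordinates of the top-left corner, plus the cell size.
origin_x, origin_y = 500000.0, 7000000.0  # hypothetical map units
cell_size = 30.0


def value_at(x, y):
    """Return the value of the cell containing (x, y), or None if outside the grid."""
    col = int((x - origin_x) // cell_size)
    row = int((origin_y - y) // cell_size)  # rows count downward from the top edge
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


print(value_at(500045.0, 6999950.0))
```

Every location inside one cell returns the same value, which is why the precision of a raster depends on its cell size.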
Key Differences
- Structure: Vector data uses geometric shapes, while raster data uses a grid of cells.
- Data Type: Vector is for discrete features, raster is for continuous phenomena.
- Precision: Vector is generally more precise, while raster’s precision depends on cell size.
- Storage: Vector often uses less storage for discrete features. Raster storage size is heavily dependent on resolution.
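A rough calculation shows why raster storage depends so strongly on resolution. This is a sketch under simple assumptions (uncompressed data, one byte per cell, one band); real files are usually compressed and carry extra metadata.

```python
import math


def uncompressed_size_bytes(width_m, height_m, cell_size_m, bytes_per_cell=1, bands=1):
    """Estimate the raw size of a raster covering width_m x height_m metres."""
    cols = math.ceil(width_m / cell_size_m)
    rows = math.ceil(height_m / cell_size_m)
    return rows * cols * bytes_per_cell * bands


# The same hypothetical 10 km x 10 km area at two cell sizes.
coarse = uncompressed_size_bytes(10_000, 10_000, 10)
fine = uncompressed_size_bytes(10_000, 10_000, 5)
print(coarse, fine, fine / coarse)
```

Halving the cell size doubles both the number of rows and the number of columns, so the raw size grows by a factor of four.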
Load Spatial Data
(1) Load vector data from files
- Step 1: Layer -> Add Layer -> Add Vector Layer
- Step 2: Enter the path to “alaska.shp” and click “Add”
- Step 3: You will see that the shapefile has been added to Layers, as shown below.
Challenge 1: Try yourself
Try adding airports.shp to the layers yourself.
Callout
When adding a data source, QGIS attempts to identify its Coordinate Reference System (CRS) from sources like a shapefile’s .prj file. If no CRS information is found, QGIS prompts you to specify it. You can modify this behavior in Settings -> Options -> CRS to automatically assign either the project’s CRS or a designated default CRS. (Graser et al., 2017)
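Before loading a shapefile, it can help to check which of its companion files are present, since a missing .prj file is what leads QGIS to ask for a CRS. The following is a minimal sketch using only the Python standard library; the function name is hypothetical and the path "alaska.shp" assumes the file sits in the current folder.

```python
from pathlib import Path


def sidecar_report(shp_path):
    """Map each common shapefile extension to whether that file exists."""
    shp = Path(shp_path)
    extensions = (".shp", ".shx", ".dbf", ".prj")
    return {ext: shp.with_suffix(ext).exists() for ext in extensions}


# Assumed location; adjust the path to wherever the workshop data is stored.
report = sidecar_report("alaska.shp")
print(report)
if not report[".prj"]:
    print("No .prj file found next to the shapefile: QGIS may ask you to choose a CRS.")
```

The check is case-sensitive on Linux, so a file named "ALASKA.PRJ" would be reported as missing.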
(2) Load CSV files
- Step 1: Layer -> Add Layer -> Add Delimited Text Layer
- Step 2: Adjust the settings highlighted by the red box in the picture below.
- Step 3: You will see that the CSV layer has been added to Layers, as shown below.
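A delimited text layer is an ordinary table in which two columns hold the X and Y coordinates of each point. The sketch below reads such a table with the Python standard library. The sample rows and the column names "name", "lon", and "lat" are assumptions made for illustration, not the columns of the workshop file.

```python
import csv
import io

# Made-up sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO(
    "name,lon,lat\n"
    "site A,-149.90,61.22\n"
    "site B,-147.72,64.84\n"
)

points = []
for row in csv.DictReader(sample):
    # Coordinates arrive as text and must be converted to numbers.
    points.append({"name": row["name"], "x": float(row["lon"]), "y": float(row["lat"])})

print(points)
```

Picking the X and Y columns here mirrors the choice of X and Y fields in the Add Delimited Text Layer dialog.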
(3) Load Raster files
- Step1: Layer -> Add Layer -> Add Raster Layer
- Step 2: Enter the path to “landcover.img” and click “Add”
- Step 3: You will see that the raster has been added to Layers, as shown below.
Challenge 1: Try yourself
Try adding SR_50M_alaska_nad.tif (Hillshade GeoTIFF) to the layers yourself.
Challenge 2: A Question?
Did you notice anything odd about the Hillshade image shown above?
Yes, the hillshade should be in the North instead of the South. Let’s open the layer Properties and change how it is displayed.
Process Spatial Data
Filter Vector Data
Scenario: My Grandma wants to take a trip to Alaska, but her doctor said she should avoid places at high elevation because of a heart problem. So I will find airports at low elevation for her, for example airports with an elevation lower than 1000 ft.
Solution:
- Step 1: Right-click the “airports” layer and select “Filter”
- Step 2: Select “ELEV” from “Fields”, and enter the filter expression as shown below:
- Step 3: Click “OK”. Now only airports with an elevation lower than 1000 ft are shown.
- Step 4 (export data): Right-click the layer and select “Export” -> “Save Features As”. Fill in the information as in the figure below and click “OK”.
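The logic of the filter can be sketched in plain Python. The records below are invented stand-ins for the airports attribute table. Only the ELEV field comes from the lesson, and the NAME field is an assumption. In the QGIS filter dialog, the matching expression would presumably be "ELEV" < 1000.

```python
# Made-up records standing in for rows of the airports attribute table.
airports = [
    {"NAME": "Airport A", "ELEV": 78},
    {"NAME": "Airport B", "ELEV": 1000},
    {"NAME": "Airport C", "ELEV": 2417},
    {"NAME": "Airport D", "ELEV": 434},
]

MAX_ELEV_FT = 1000  # threshold from the scenario

# Keep only airports strictly below the threshold.
low_airports = [a for a in airports if a["ELEV"] < MAX_ELEV_FT]
print([a["NAME"] for a in low_airports])
```

Because the comparison is strict, an airport at exactly 1000 ft is excluded. Use `<=` instead if it should be kept.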
Raster Calculation
Scenario: I’d like to find some sunny slopes where my Grandma and I can go skiing. As an example, I look for the places where the Hillshade value is smaller than 100.
Solution:
- Step 1: Turn on the Processing Toolbox if it is off, and search for “Raster Calculator”.
- Step 2: Fill in “Input Layers”, “Expression”, and “Output CRS”, and set the output file under “Calculated”.
- Step 3: Click “Run” and adjust the styling of the output.
- The output file was already written in Step 2. If you skipped that, you can still export the data by right-clicking the layer and selecting “Export” -> “Save As”. Then fill in the information as in the figure below and click “OK”.
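A raster calculation applies the same expression to every cell. The sketch below mimics the threshold from the scenario on a tiny made-up hillshade grid using plain Python lists. It illustrates the idea only and is not how QGIS implements the Raster Calculator.

```python
# Made-up hillshade values (0-255) on a 3 x 3 grid.
hillshade = [
    [40, 95, 180],
    [100, 120, 60],
    [220, 99, 101],
]

THRESHOLD = 100  # value used in the scenario

# Cell-by-cell comparison: 1 where the condition holds, 0 elsewhere.
mask = [[1 if value < THRESHOLD else 0 for value in row] for row in hillshade]

for row in mask:
    print(row)
```

The result is a new raster of the same shape that holds 1 where the hillshade is below the threshold and 0 everywhere else.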
Key Points
- QGIS is free and open-source software that runs on various platforms, such as Windows, macOS, and Linux.
- QGIS has a large and active community of users and developers who contribute to its features, plugins, documentation, and support.
- We can use QGIS via ThinLinc Client or Gateway on HPC Clusters.
- We learned how to load and visualize vector and raster data.
- We learned how to process and export data.