RTOpacks Data Sources¶
This is the authoritative reference for every data source that feeds rtopacks-db. It covers what the data is, where it comes from, how it was ingested, which tables it populates, current row counts, and what the update cadence and method should be.
Last audited: 26 March 2026
Database: rtopacks-db (D1: 334ac8fb-9850-48c0-9da0-b56c55640e98)
Cloudflare account: e5a9830215a8d88961dc6c80a8c7442a
Provenance table: data_sources (all sources should be registered here)
Quick Reference¶
| # | Source | Tables | Rows | Registered | Refresh |
|---|---|---|---|---|---|
| 1 | National Training Register (TGA) | qualifications, units, qualification_units, rtos, rto_scope + 7 more | ~490,000 | ✅ | Quarterly |
| 2 | yourcareer.gov.au — Search API | qualifications (enrichment columns) | 7,054 quals | ✅ | Annual |
| 3 | yourcareer.gov.au — Pathways API | qual_career_pathways, qualification_specialisations | 54,640 + 967 | ✅ | Annual |
| 4 | CRICOS | cricos_providers/courses/institutions/locations/course_locations | 79,656 total | ✅ | Monthly |
| 5 | NCVER VOCSTATS | vocstats_* (8 tables) | ~9,200 total | ⚠️ Not registered | Annual |
| 6 | JSA VNDA — scrape | recon_vnda, recon_vnda_aqf, recon_vnda_foe | 513 total | ⚠️ Not registered | When JSA updates |
| 7 | JSA VNDA — export (deprecated) | vnda_atlas | 785 | ⚠️ Not registered | DEPRECATED |
| 8 | JSA OSL — Occupation Shortage List | osl_ratings | 9,454 | ⚠️ Not registered | Annual |
| 9 | JSA IVI — Internet Vacancy Index | ivi_vacancies | 37,908 | ⚠️ Not registered | Monthly |
| 10 | JSA Employment Projections | emp_projections | 358 | ⚠️ Not registered | Annual |
| 11 | JSA GLMD — Regional Labour Market | glmd_regional | 96 | ⚠️ Not registered | Monthly |
| 12 | JSA Training API — Per-qual enrolments | jsa_qual_training (new) | 0 — Brief #25 | ⚠️ Not registered | Annual July |
| 13 | NSW Smart and Skilled | state_funding (NSW rows) | 1,164 | ✅ | Annual |
| 14 | QLD Career Start / QSTL | state_funding (QLD rows) | 459 | ✅ | Annual |
| 15 | VIC Free TAFE | state_funding (VIC rows) | 7 ⚠️ incomplete | ⚠️ Not registered | Annual |
| 16 | SA WorkReady | state_funding (SA rows) | 720 | ✅ | Annual |
| 17 | WA Jobs and Skills WA | state_funding (WA rows) | 175 | ✅ | Annual |
| 18 | TAS Skills Tasmania | state_funding (TAS rows) | 420 | ✅ | Annual |
| 19 | NT Fee-Free TAFE / CDU | state_funding (NT rows) | 95 | ✅ | Annual |
| 20 | ACT Skills | state_funding (ACT rows) | 0 — not yet ingested | ❌ | — |
| 21 | ABR Reference Codes | abr_codes | 167 | ⚠️ Not registered | Rarely |
| 22 | Unit Stats (derived) | units (computed columns) | 15,200 | ✅ | After TGA ingest |
Sources marked ⚠️ will be registered in data_sources as part of Brief #25.
1. National Training Register (TGA)¶
data_sources key: tga_corpus
Authority: Department of Employment and Workplace Relations
URL: https://training.gov.au
Licence: Creative Commons Attribution 4.0
Auth: None required
Last ingested: Not recorded in data_sources — TGA ingest predates the provenance registry
What it is¶
The authoritative Australian register of all vocational education and training. Contains every qualification, unit of competency, skill set, and accredited course across all statuses (current, superseded, expired), plus every registered RTO and their approved delivery scope. This is the foundational dataset — everything else in the database enriches on top of it.
How it was ingested¶
Ingested by the TGA adapter in the UCCA engine's adapters/tga/ module. The engine reads TGA's public data endpoints, applies the adapter, and populates the corpus tables in rtopacks-db. This is a world-level ingest: it supplies the foundational data for the AU-VET world.
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| qualifications | 8,007 | All qualifications — code, title, AQF level, training package, packaging rules, supersession chain (supersedes, superseded_by), status, entry requirements, description |
| units | 75,189 | All units of competency — code, title, description, elements, performance criteria, knowledge evidence, performance evidence, assessment conditions, foundation skills, training package, status |
| qualification_units | 244,874 | Junction table — which units belong to which qualifications, core vs elective flags |
| rtos | 12,515 | All registered training organisations — code, legal name, ABN, status, type |
| rto_scope | 150,000 | RTO delivery scope — which RTOs are approved to deliver which qualifications, in which states |
| rto_addresses | 1,064 | RTO physical and postal addresses |
| rto_contacts | 2,176 | RTO contact details (email, phone, fax) |
| rto_legal_names | 139 | RTO legal name history |
| rto_trading_names | 137 | RTO trading names |
| rto_registrations | 267 | RTO registration periods, regulators (ASQA / state) |
| rto_web_addresses | ~137 | RTO website URLs |
| rto_classifications | ~267 | RTO type classifications |
Update cadence and method¶
TGA updates continuously as training packages are endorsed, superseded, and as RTOs register/deregister. Re-run the TGA adapter ingest quarterly, or when a major training package update is announced by a relevant Industry Reference Committee (IRC). After each TGA ingest, also re-run the unit_stats aggregate (source #22).
2. yourcareer.gov.au — Qualification Search API¶
data_sources key: yourcareer_search
Authority: Jobs and Skills Australia / Department of Employment and Workplace Relations
API URL: https://api.yourcareer.gov.au/api/Courses/search
Auth: None required
Last ingested: 2026-03-21 | Records: 7,054
Ingest script: B-ENRICH-YOURCAREER-01
What it is¶
The yourcareer.gov.au platform (the successor to myskills.gov.au) exposes a public, unauthenticated API that returns financial and employment metadata per qualification: student fees, VSL (VET Student Loans) eligibility, employment outcome percentages, typical duration, and which RTOs deliver each course.
How it was ingested¶
Paginated GET requests to the search endpoint with pageSize=100, iterated until every qualification had been returned. No auth required. Results were written as enrichment columns directly onto the qualifications table.
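The pagination loop described above can be sketched as follows. This is a minimal illustration, not the B-ENRICH-YOURCAREER-01 script itself: `fetch_page` is a hypothetical callable standing in for the HTTP GET, and the stop condition (a page shorter than pageSize) is an assumption about how the API signals the last page.

```python
from typing import Callable, Iterator

PAGE_SIZE = 100  # pageSize value from the ingest notes


def iter_search_results(
    fetch_page: Callable[[int, int], list],
    page_size: int = PAGE_SIZE,
) -> Iterator[dict]:
    """Yield every record, requesting pages until a short or empty page arrives.

    fetch_page(page_number, page_size) is a hypothetical helper that performs
    the GET against the search endpoint and returns that page's results.
    """
    page = 1
    while True:
        results = fetch_page(page, page_size)
        yield from results
        if len(results) < page_size:
            break  # assumed end-of-data signal: last page is not full
        page += 1
```

Keeping the HTTP call behind `fetch_page` means the loop can be exercised against canned pages before it is pointed at the live API.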
Fields written to qualifications¶
| Column | Description |
|---|---|
| yc_fee_min / yc_fee_max | Student fee range ($) |
| yc_vsl_eligible | VET Student Loan eligible flag (1/0) |
| yc_has_subsidies | Government subsidy available flag |
| yc_apprenticeship_states | States where apprenticeship pathway is available |
| yc_is_apprenticeship | Is an apprenticeship pathway flag |
| yc_offered_online | Online delivery available flag |
| yc_employed_pct | Employment % reported after completion |
| yc_rto_codes | Comma-separated list of delivering RTO codes |
| yc_industry_codes | Industry classification codes |
| yc_occupation_codes | ANZSCO occupation codes |
| yc_duration | Typical course duration |
| yc_has_rto_offerings | Whether any RTOs are currently delivering |
| yc_superseded_by | Supersession reference from yourcareer |
| yc_date_modified | Last modification date in yourcareer |
| yc_enriched_at | Timestamp of enrichment write |
Derived fields (computed by B-MODEL-01 from this data):
| Column | Description |
|---|---|
| market_score | 0–100 market viability score per qual (BSB50420 tops at 99) |
| vsl_value_ratio | VSL loan efficiency signal (loan amount vs income uplift) |
| vsl_total_cost | Total VSL loan cost for the qualification |
Update cadence and method¶
Annual. Fee and VSL eligibility data changes with each government budget cycle (typically May/June). Re-run the yourcareer ingest script against the same paginated API. Upsert on qual_code.
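As an illustration of the upsert on qual_code, here is a minimal sketch using Python's sqlite3 module (D1 is SQLite-based, so the SQL should carry over). Only three of the yc_ columns are shown, the function name is hypothetical, and the statement assumes qual_code carries a unique or primary-key constraint.

```python
import sqlite3
from datetime import datetime, timezone

# Assumes qualifications.qual_code is UNIQUE or PRIMARY KEY.
UPSERT_SQL = """
INSERT INTO qualifications
    (qual_code, yc_fee_min, yc_fee_max, yc_vsl_eligible, yc_enriched_at)
VALUES
    (:qual_code, :yc_fee_min, :yc_fee_max, :yc_vsl_eligible, :yc_enriched_at)
ON CONFLICT(qual_code) DO UPDATE SET
    yc_fee_min      = excluded.yc_fee_min,
    yc_fee_max      = excluded.yc_fee_max,
    yc_vsl_eligible = excluded.yc_vsl_eligible,
    yc_enriched_at  = excluded.yc_enriched_at
"""


def upsert_enrichment(conn: sqlite3.Connection, rows: list) -> None:
    """Write enrichment values, stamping each row with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:  # one transaction for the whole batch
        conn.executemany(UPSERT_SQL, [{**row, "yc_enriched_at": now} for row in rows])
```

One caveat worth checking against the real schema: if qualifications has NOT NULL columns that the enrichment does not supply, SQLite is likely to raise the NOT NULL error before the conflict clause applies, in which case a plain UPDATE keyed on qual_code may be the better fit.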
3. yourcareer.gov.au — Career Pathways API¶
data_sources key: yourcareer_pathways
API URL: https://api.yourcareer.gov.au/api/Courses/{code}/PathwaysIndustry
Auth: None required
Last ingested: 2026-03-21 | Records: 54,640
What it is¶
Per-qualification career pathway data from yourcareer.gov.au — job title progressions and career ladders by industry stream across AQF levels. This powers the Career Ladder View in RTOpacks.
How it was ingested¶
One API call per qualification code, using each qual_code from the qualifications table as the {code} parameter. Results written to separate tables.
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| qual_career_pathways | 54,640 | Job title progression per qualification per industry stream — entry, mid, senior roles |
| qualification_specialisations | 967 | Specialisation streams available within each qualification |
Update cadence and method¶
Annual, same cycle as yourcareer_search. Re-run the pathways ingest per-qual after yourcareer_search completes. Upsert on qual_code + pathway identifier.
4. CRICOS — Commonwealth Register of Institutions and Courses for Overseas Students¶
data_sources key: cricos
Authority: Department of Education
Source URL: https://cricos.education.gov.au
Download URL: https://data.gov.au/data/dataset/e5ae7059-bfa8-4fa4-a5c0-c13cf3520193/resource/63fd9610-5bea-438c-bac7-29289d38cfbb/download/cricos-providers-courses-locations.zip
Licence: Creative Commons Attribution 2.5 Australia
Snapshot date: 2 March 2026
What it is¶
The official register of all Australian institutions and courses approved for international (overseas) student enrolment. Contains CRICOS provider codes, course codes, international tuition fees, course durations, and delivery locations. Used to identify which qualifications attract international students and at what fee benchmarks.
How it was ingested¶
ZIP file downloaded from data.gov.au (the ZIP URL above is stable and updated regularly). Unzipped to four CSVs, each loaded directly to its target table. The ZIP also contains a PDF purpose statement — ignore it.
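The unzip-and-load step can be sketched like this. It is a minimal example under assumptions: the archive is already downloaded into memory, the CSVs are UTF-8 (possibly with a BOM), and `read_cricos_csvs` is a hypothetical helper name. Mapping each CSV to its target table is left out because the member filenames are not recorded here.

```python
import csv
import io
import zipfile


def read_cricos_csvs(zip_bytes: bytes) -> dict:
    """Return {member name: list of row dicts} for every CSV in the archive."""
    tables = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skips non-CSV members such as the PDF purpose statement
            with archive.open(name) as handle:
                # Encoding is an assumption; utf-8-sig also tolerates a leading BOM.
                text = io.TextIOWrapper(handle, encoding="utf-8-sig", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables
```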
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| cricos_providers | 1,542 | Providers registered with CRICOS — provider code, name, status |
| cricos_institutions | 1,552 | Institution details — ABN, type, trading names |
| cricos_courses | 26,172 | Courses approved for international delivery — course code, tuition fee, duration, CRICOS course ID |
| cricos_locations | 3,907 | Physical delivery locations per provider |
| cricos_course_locations | 46,483 | Course-location junction — which courses are delivered at which locations |
Enrichment written to qualifications: cricos_provider_count and cricos_avg_tuition_aud (number of CRICOS-registered providers and average international tuition fee for each qual code).
Update cadence and method¶
Monthly. CRICOS data changes regularly as providers register/deregister courses and update fees. The data.gov.au ZIP URL is stable — simply re-download and re-ingest monthly.
Download: https://data.gov.au/data/dataset/e5ae7059-bfa8-4fa4-a5c0-c13cf3520193/resource/63fd9610-5bea-438c-bac7-29289d38cfbb/download/cricos-providers-courses-locations.zip
5. NCVER VOCSTATS — Training Activity Data¶
data_sources key: ncver_vocstats ⚠️ Not yet registered
Authority: National Centre for Vocational Education Research (NCVER)
URL: https://www.ncver.edu.au/research-and-statistics/vocstats
Auth: Free NCVER registration required
Ingested: 2026-03-20 — manual session
What it is¶
NCVER's VOCSTATS tool provides access to the Total VET Activity (TVA) database — the national mandatory collection of all VET enrolments and completions reported by RTOs under AVETMISS. Also includes the VET Student Outcomes Survey (SOS) and Apprentices and Trainees collection. These are aggregate tables only — data is by Field of Education (FOE) or AQF level, not by individual qualification. Per-qual volumes come from the JSA API (source #12).
Source database keys:
- TVA enrolments/completions: tva_prg_1524_ext_nvetr_rel24
- SOS outcomes: SOS_tva_1625_ext_rel25
How it was ingested¶
Manual session using the VOCSTATS web query tool at ncver.edu.au. Queries were built with selected dimensions, downloaded as .xlsx + .json pairs. All files were wide format (years as column headers, rows 1–8 are metadata — skip on ingest). Alex ingested using the xlsx npm package. Original files are timestamped by download time.
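The original ingest used the xlsx npm package. The reshaping it had to do (skip the eight metadata rows, then turn year columns into rows) is sketched below in Python for illustration only. The sketch assumes a single label column followed by year columns; the multi-dimension tables would need more label columns, and `wide_to_long` is a hypothetical name.

```python
def wide_to_long(rows: list, metadata_rows: int = 8) -> list:
    """Convert a wide sheet (years as column headers) to one record per label and year.

    rows is the sheet as a list of row lists. The first metadata_rows rows are
    skipped, and the next row is treated as the header.
    """
    header, *body = rows[metadata_rows:]
    years = header[1:]  # assumption: column 0 is the label, the rest are years
    records = []
    for row in body:
        label = row[0]
        for year, value in zip(years, row[1:]):
            if value is None or value == "":
                continue  # leave blank cells out rather than store empty values
            records.append({"label": label, "year": int(year), "value": value})
    return records
```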
Tables populated¶
| Table | Rows | Dimensions | Source collection | File timestamp |
|---|---|---|---|---|
| vocstats_enrolments_foe | 130 | FOE × Year (2015–2024) | TVA enrolments | table_2026-03-20_18-20-45.xlsx |
| vocstats_completions_foe | 130 | FOE × Year (2015–2024) | TVA completions | table_2026-03-20_18-27-37.xlsx |
| vocstats_enrolments_apprentice | 1,459 | Apprentice/trainee × FOE × Year | TVA enrolments | table_2026-03-20_18-55-58.xlsx |
| vocstats_enrolments_funding | 5,791 | Funding source × FOE × Year | TVA enrolments | table_2026-03-20_19-00-10.xlsx |
| vocstats_enrolments_international | 1,160 | International/domestic × 4-digit FOE × Year | TVA enrolments | table_2026-03-20_18-36-44.xlsx + table_2026-03-20_19-08-22.xlsx |
| vocstats_enrolments_vis | 190 | VET in Schools: provider type × school status × training type × year left school × Year | TVA enrolments | table_2026-03-20_18-51-43.xlsx |
| vocstats_outcomes | 129 | Employed/further study × FOE × Year (2016–2025) | SOS outcomes | table_2026-03-20_19-12-14.xlsx |
| vocstats_outcomes_aqf | 90 | Employed/further study × AQF level × Year (2016–2025) | SOS outcomes | table_2026-03-20_17-44-29.xlsx |
Update cadence and method¶
Annual. NCVER releases TVA data around June each year (covering the previous calendar year). SOS outcomes are also released annually.
Currently manual — Tim re-runs the VOCSTATS query session and downloads new files, Alex re-ingests. No automated path exists yet. Per-qual enrolment/completion volumes (a separate gap) are addressed by Brief #25 via the JSA API (source #12), which is automatable.
6. JSA VNDA — VET National Data Asset (Graduate Outcomes)¶
data_sources key: jsa_vnda_scrape ⚠️ Not yet registered
Authority: Jobs and Skills Australia, ABS, NCVER (integrated dataset)
Source URL: https://www.jobsandskills.gov.au/data/vocational-education-training/vet-national-data-asset
API URL: https://www.jobsandskills.gov.au/api/v1/opensearch/vnda/_search
Auth: None required
Ingested: 2026-03-19 (Python scraper vnda_scraper.py, then vnda_ingest.py)
What it is¶
VNDA is an integrated data asset linking NCVER VET activity records with ATO tax data, Department of Social Services income support records, and Department of Education data. It provides the most accurate picture of what happens to VET graduates after completion — employment rates, income uplift, income support exit rates, and further study progression. Currently covers the top ~500 courses by completion volume, with outcomes for the FY2019-20 completion cohort reported in FY2020-21.
Critical limitations:
- Covers only the top ~500 courses — 494 of 1,167 current quals (42%)
- Data is frozen at FY2020-21 as of March 2026 — no update has been published since
- The series_period field must always be displayed alongside any VNDA-derived figure in the UI
How it was ingested¶
Direct scrape of the JSA VNDA OpenSearch API (vnda/_search index). The Python script vnda_scraper.py submitted a single POST query for FY2020-21 with size: 10000, which returned all 494 records in one request. The records were then ingested to D1 via vnda_ingest.py. A companion scrape also pulled aggregate views (by AQF level and by FOE), which populate the _aqf and _foe sibling tables.
Tables populated¶
| Table | Rows | Content | Status |
|---|---|---|---|
| recon_vnda | 494 | Per-qual outcomes + student characteristics JSON | CANONICAL — use this |
| recon_vnda_aqf | 7 | Outcomes aggregated by AQF level | Canonical |
| recon_vnda_foe | 12 | Outcomes aggregated by Field of Education | Canonical |
| recon_vnda_state | 0 | State-level outcomes — empty, not yet populated | — |
Fields per qualification in recon_vnda¶
| Field | Description |
|---|---|
| qual_id | Qualification code |
| employed_pct | Employment rate post-completion (e.g. 0.76 = 76%) |
| employment_change_pct | Change in employment rate after completing vs before |
| median_income | Median annual income post-completion ($) |
| median_income_change | Income uplift after completion ($) |
| income_support_exit_rate | % who exited income support after completing |
| higher_vet_progression | % who progressed to a higher-level VET qualification |
| any_vet_progression | % who did any further VET study |
| student_characteristics | JSON: pctFemale, pctDisability, pctFirstNations, pctApprenticeTrainees, pctNoYr12NoCert3, medianCompletionAgeYrs, medianCompletionTimeDays, pctRemote, pctRegional, pctMajorCity |
| series_period | Financial year of data (currently FY2020-21) |
| vnda_source_code | Actual qual code that returned data — may differ from qual_id if supersession walk used (Brief #25) |
| vnda_match_type | Match method: direct, supersedes, superseded_by, sibling, none (Brief #25) |
Update cadence and method¶
When JSA publishes a new VNDA report (no fixed schedule — FY2020-21 has been the only release since 2022). Brief #25 jsa-ingest Worker includes a November cron that detects and ingests new data when it arrives. Brief #25 also adds a supersession walk to push coverage from 42% to ~70%+.
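One way the supersession walk could assign vnda_source_code and vnda_match_type is sketched below. This is an assumption about Brief #25, not its actual design: the function names, the two link dictionaries (each mapping a qual code to the single code it supersedes or is superseded by), and the order of preference are all illustrative. The sibling match type is left out because its definition is not given here.

```python
from typing import Optional


def _walk(start: str, links: dict, vnda_codes: set) -> Optional[str]:
    """Follow links from start until a code with VNDA data is found."""
    seen = {start}
    code = links.get(start)
    while code is not None and code not in seen:  # 'seen' guards against cycles
        if code in vnda_codes:
            return code
        seen.add(code)
        code = links.get(code)
    return None


def resolve_vnda_source(qual_id: str, vnda_codes: set,
                        supersedes: dict, superseded_by: dict) -> tuple:
    """Return (vnda_source_code, vnda_match_type) for one qualification."""
    if qual_id in vnda_codes:
        return qual_id, "direct"
    # Assumed preference: look back through predecessors before successors.
    for match_type, links in (("supersedes", supersedes),
                              ("superseded_by", superseded_by)):
        found = _walk(qual_id, links, vnda_codes)
        if found is not None:
            return found, match_type
    return None, "none"
```

Whichever match type is returned, the figure shown in the UI still needs its series_period alongside it.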
7. JSA VNDA — Export (DEPRECATED)¶
data_sources key: jsa_vnda_export ⚠️ Not yet registered
Status: DEPRECATED — do not use
Ingested: 2026-03-19 | Rows: 785
Why it is deprecated¶
A three-way comparison between vnda_atlas, recon_vnda, and the live JSA API for BSB30120 shows vnda_atlas values are incorrect:
| Field | vnda_atlas | recon_vnda | JSA API |
|---|---|---|---|
| Employment rate | 0.69 ❌ | 0.76 ✓ | 0.76 |
| Median income | $36,300 ❌ | $38,700 ✓ | $38,700 |
| Income support exit | 0.26 ❌ | 0.28 ✓ | 0.28 |
The source of vnda_atlas is unknown — likely a Power BI export or an earlier Atlas data download that used different rounding or a different data vintage. The table is retained for audit purposes only and will be flagged as deprecated by Brief #25. Do not reference vnda_atlas in any new code.
8. JSA Occupation Shortage List (OSL)¶
data_sources key: jsa_osl ⚠️ Not yet registered
Authority: Jobs and Skills Australia
URL: https://www.jobsandskills.gov.au/data/occupation-shortage-list
Auth: None required
Ingested: 2026-03-19 | Rows: 9,454
What it is¶
Annual assessment of shortage status for occupations across Australia, by state/territory and nationally. Formerly known as the Skills Priority List (SPL). Ratings are based on data modelling, statistical analysis, stakeholder consultation, and employer surveys. Updated annually by JSA.
Shortage ratings:
| Code | Meaning |
|---|---|
| S | Shortage — employers have considerable difficulty filling vacancies nationally |
| R | Regional shortage only |
| M | Metro shortage only |
| NS | No Shortage — no significant evidence of difficulty filling vacancies |
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| osl_ratings | 9,454 | ANZSCO code × ANZSCO version × state × year — shortage rating, skill level, alt titles, shortage driver |
Columns available: anzsco_code, anzsco_ver, occ_title, skill_level, alt_titles, year, rnat (national), rnsw, rvic, rqld, rsa, rwa, rtas, rnt, ract (state ratings), driver
Columns to be added by Brief #25:
- shortage_driver — long_training_gap / short_training_gap / suitability_gap / retention_gap / uncertain
- is_clean_energy_critical — boolean, 1 for the 38 JSA Clean Energy Capacity Study critical occupations
Update cadence and method¶
Annual. The Brief #25 jsa-ingest Worker checks for OSL updates on its lightweight monthly run; the data only changes annually, but the file is cheap to re-fetch.
9. JSA Internet Vacancy Index (IVI)¶
data_sources key: jsa_ivi ⚠️ Not yet registered
Authority: Jobs and Skills Australia
URL: https://www.jobsandskills.gov.au/data/internet-vacancy-index
Auth: None required
Ingested: 2026-03-19 | Rows: 37,908
What it is¶
Monthly count of newly lodged online job advertisements by ANZSCO occupation, state, and region. Sourced from SEEK, CareerOne, and Workforce Australia job boards. Released on the third Wednesday of each month. Provides a demand signal for occupations — useful as a leading indicator alongside OSL shortage ratings.
Important caveats:
- Counts new ads posted during the reference month — not total open vacancies
- Biased toward higher-skilled positions (lower-skilled roles use informal recruitment)
- SA4-level data is experimental
- IVI data is based on place of work; NERO employment data is based on place of residence — these are not strictly comparable
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| ivi_vacancies | 37,908 | ANZSCO code × month × state — vacancy ad counts |
Update cadence and method¶
Monthly. Brief #25 jsa-ingest Worker — monthly cron 0 2 1 * *. The IVI download file URL includes the release month in its filename. The Worker must either scrape the IVI download page for the current URL or construct it from the known URL pattern:
https://www.jobsandskills.gov.au/system/files/YYYY-MM/Internet%20Vacancies%2C%20ANZSCO2%20Occupations%2C%20States%20and%20Territories%20-%20{Month}%20{Year}.xls
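Constructing the URL from the known pattern could look like the sketch below. The folder month and the month named in the file are passed separately because it is an assumption whether they always coincide. `ivi_url` is a hypothetical helper, and scraping the download page remains the safer fallback if the pattern changes.

```python
from datetime import date
from urllib.parse import quote

IVI_BASE = "https://www.jobsandskills.gov.au/system/files"
IVI_FILENAME = (
    "Internet Vacancies, ANZSCO2 Occupations, States and Territories - {month} {year}.xls"
)
# Explicit English month names so the result does not depend on the process locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def ivi_url(folder_month: date, data_month: date) -> str:
    """Build the IVI download URL for a given upload folder and data month."""
    filename = IVI_FILENAME.format(month=MONTHS[data_month.month - 1],
                                   year=data_month.year)
    # quote() percent-encodes the spaces (%20) and commas (%2C) as in the pattern.
    return f"{IVI_BASE}/{folder_month:%Y-%m}/{quote(filename)}"
```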
10. JSA Employment Projections¶
data_sources key: jsa_emp_projections ⚠️ Not yet registered
Authority: Jobs and Skills Australia / Victoria University (VUEF model)
URL: https://www.jobsandskills.gov.au/data/employment-projections
Auth: None required
Ingested: 2026-03-19 | Rows: 358
What it is¶
National employment projections to May 2034 by occupation and industry, produced by Victoria University using the Victoria University Employment Forecasting (VUEF) model, calibrated against Australian Treasury macroeconomic forecasts. Available at 5-year and 10-year horizons. These are forward-looking demand signals — useful for showing whether an occupation is projected to grow or decline.
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| emp_projections | 358 | ANZSCO code × base year × state — 5-year and 10-year projected employment change (absolute headcount and %) |
Update cadence and method¶
Annual. JSA publishes updated projections yearly. Brief #25 jsa-ingest Worker checks for updates on the monthly run.
11. JSA General Labour Market Data (GLMD) — Regional¶
data_sources key: jsa_glmd ⚠️ Not yet registered
Authority: Jobs and Skills Australia
File URL pattern: https://www.jobsandskills.gov.au/system/files/datasets/glmd (YYYY-MM).json
Auth: None required
Ingested: 2026-03-19 | Rows: 96
What it is¶
Monthly static JSON file containing general labour market indicators by SA4 region and state: employment rate, unemployment rate, participation rate, population, youth unemployment, and the Regional Labour Market Indicator (RLMI — a composite performance score rating regions as Strong / Above average / Average / Below average / Poor).
Tables populated¶
| Table | Rows | Content |
|---|---|---|
| glmd_regional | 96 | Region code × data date — employment rate, unemployment, participation, population, RLMI value and label |
Update cadence and method¶
Monthly. The file URL includes the release month — the Worker fetches the latest file by constructing the URL from the current date. Brief #25 jsa-ingest Worker — monthly cron.
12. JSA Training API — Per-Qualification Enrolments and Completions¶
data_sources key: jsa_training_api ⚠️ Not yet registered
Authority: Jobs and Skills Australia (sourced from NCVER TVA)
API URL: https://www.jobsandskills.gov.au/api/v1/opensearch/training/_search
Auth: None required
Status: Not yet ingested — Brief #25 adds this
What it is¶
The JSA Atlas OpenSearch API's training index contains per-qualification enrolment and completion volumes sourced from NCVER TVA — the data that VOCSTATS provides only at FOE-level aggregate. This gives us individual qual-level figures: e.g. BSB30120 Certificate III in Business: 67,112 enrolments in 2024, 22,192 completions. Also includes FOE classification (linking quals to the broader Field of Education taxonomy) and demographic segmentation (gender, Indigenous status, disability status per qual).
Confirmed live values for BSB30120 (2024):
- Enrolments: 67,112 (−10.1% YoY)
- Completions: 22,192 (−3.1% YoY)
- FOE code: 0809 (Office studies)
- Gender split: 63% Female, 36.5% Male
Tables to be created by Brief #25¶
| Table | Content |
|---|---|
| jsa_qual_training | Per-qual: enrolments, completions, YoY trends, FOE code/name, AQF level — national only initially |
| jsa_qual_training_segments | Demographic breakdown per qual, open-ended key/value design — gender, Indigenous, disability |
⚠️ Critical display rule: Enrolments and completions must never be divided to produce a "completion rate" — they represent different student cohorts in different years. JSA explicitly warns against this comparison.
Update cadence and method¶
Annual — NCVER TVA releases around June each year. Brief #25 jsa-ingest Worker — vet run, cron 0 2 1 7 * (1 July annually). Rate limit: 1 req/sec to JSA API.
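The 1 req/sec limit can be enforced with a small throttle around whatever loop issues the per-qual requests. The sketch below is in Python for illustration; the Worker's runtime and implementation may differ. The `throttled` name and the injectable clock and sleep parameters are assumptions made so the pacing can be tested without real waiting.

```python
import time
from typing import Callable, Iterable, Iterator


def throttled(items: Iterable, min_interval: float = 1.0,
              clock: Callable[[], float] = time.monotonic,
              sleep: Callable[[float], None] = time.sleep) -> Iterator:
    """Yield items no faster than one per min_interval seconds."""
    last = None
    for item in items:
        if last is not None:
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)  # pause only for the remainder of the interval
        last = clock()
        yield item
```

Usage would be along the lines of `for code in throttled(qual_codes): ...`, with the request made inside the loop body.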
13–20. State Government Funding Lists¶
All state data is stored in the single state_funding table (3,040 rows total). Fields: qual_code, state, funded, free, free_apprenticeship, pathway, program_name, co_contribution, conditions, source_version, ingested_at.
NSW — Smart and Skilled¶
data_sources key: ssl_nsw ✅
URL: https://www.nsw.gov.au/education-and-training/vocational/nsw-skills-list
Method: Excel download — Skills List v16.2, xlsx parse
Rows: 1,164 | Last ingested: 2026-03-24
Version: NSW Skills List v16.2 @ 01/01/2026
Notes: Three pathways (Traineeship, Apprenticeship, General Training). 149 quals flagged as fee-free (NFF). Largest state dataset by row count.
QLD — Career Start / Queensland Subsidised Training List (QSTL)¶
data_sources key: qstl_qld ✅
URL: https://dtet.qld.gov.au/training/providers/funded/subsidised-training-list
Method: PDF extract via pdftotext and regex — QSTL v4 Feb 2026
Rows: 459 | Last ingested: 2026-03-24
Notes: Includes max completion payable to RTO — useful for margin intelligence. Available from the QLD Publications portal.
VIC — Free TAFE / Skills First¶
data_sources key: funded_vic ⚠️ Not yet registered
URL: https://www.vic.gov.au/free-tafe-courses
Rows: 7 ⚠️ SIGNIFICANTLY INCOMPLETE
Last ingested: 2026-03-24
Notes: Only 7 Free TAFE rows ingested. The full Victorian Skills First Training Needs List (Excel download from vic.gov.au) contains hundreds of quals with subsidy rates and should be re-ingested. This is a known gap. The source_version for current rows is VIC Free TAFE 2026.
SA — WorkReady¶
data_sources key: stl_sa ✅
URL: https://providers.skills.sa.gov.au/subsidised-training-list
Method: PDF parse via pdftotext and regex — STAL + TPL components
Rows: 720 (299 WorkReady + 421 WorkReady Apprenticeship) | Last ingested: 2026-03-24
Version: SA STL v11.2
Notes: Two components — STAL (training contracts) and TPL (general training priority list). PDF parsing may break if SA changes their STL format.
WA — Jobs and Skills WA¶
data_sources key: funded_wa ✅
URL: https://www.jobsandskills.wa.gov.au/course-list
Method: Puppeteer scrape of Drupal AJAX course list, then title-matched to qualifications table
Rows: 175 | Last ingested: 2026-03-24
Notes: 58 fee-free, 121 low-fee. 182 WA courses could not be matched (mostly skill sets not in our qual table). Jobs and Skills WA program.
TAS — Skills Tasmania¶
data_sources key: funded_tas ✅
URL: https://www.skills.tas.gov.au/providers/rto/courses_approved_and_funded_in_tasmania
Method: Excel download — two separate files (non-apprenticeship and apprenticeship/traineeship)
Rows: 420 | Last ingested: 2026-03-24
Notes: Excel file URLs include a date in the filename — check the TAS page for current URLs at each refresh. Previous URLs:
- Non-app: 19-RTOs-funded-to-deliver-qualifications-as-non-apprenticeship-and-traineeship-as-at-20-March-2026.xlsx
- App: 21-RTOs-funded-to-deliver-qualificatinos-as-an-apprenticeship-or-traineeship-as-at-20-March-2026.xlsx
NT — Fee-Free TAFE / User Choice (Charles Darwin University)¶
data_sources key: funded_nt ✅
URL: https://www.cdu.edu.au/courses?type=vet
Method: CDU course page scrape — qual code regex extraction from rendered HTML
Rows: 95 | Last ingested: 2026-03-24
Notes: NT data sourced from CDU VET catalogue only. CDU is the primary NT TAFE provider but not the only one — Batchelor Institute also delivers. Fee-free status not individually confirmed per qual. Marked as potentially incomplete.
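Several of the state ingests above (QLD, SA, NT) rely on pulling qualification codes out of extracted text with a regex. A minimal sketch of that step follows; the pattern is an assumption that covers training package codes shaped like BSB30120 (three letters, five digits) and would not match accredited course codes, which use a different format.

```python
import re

# Assumed shape: three uppercase letters followed by five digits, e.g. BSB30120.
QUAL_CODE_RE = re.compile(r"\b[A-Z]{3}\d{5}\b")


def extract_qual_codes(text: str) -> list:
    """Return distinct qualification codes in order of first appearance."""
    # dict.fromkeys de-duplicates while preserving insertion order.
    return list(dict.fromkeys(QUAL_CODE_RE.findall(text)))
```

Codes extracted this way still need validating against the qualifications table, since a pattern match alone does not confirm that the code exists.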
ACT — ACT Skills¶
data_sources key: Not registered
Status: Not yet ingested
Notes: ACT Skills Fund approved training list is published but has not been ingested. Lower priority given market size. Should be added in the next state funding refresh cycle.
Update cadence and method for all state lists¶
Annual — most state lists update January–March following budget cycles. NSW and VIC publish mid-year updates. Each state is a different format and ingest method:
| State | Format | Automation potential |
|---|---|---|
| NSW | Excel (stable URL) | High — URL predictable |
| QLD | PDF (URL changes with version) | Medium — requires PDF parser |
| VIC | Excel (stable URL) | High — URL predictable, but full list not yet wired |
| SA | PDF (URL changes with version) | Medium — requires PDF parser |
| WA | Drupal AJAX scrape | Low — requires Puppeteer |
| TAS | Excel (URL includes date) | Medium — URL pattern predictable |
| NT | HTML scrape (CDU) | Low — rendered HTML, may break |
| ACT | Not yet implemented | — |
21. ABR Reference Codes¶
data_sources key: abr_reference ⚠️ Not yet registered
Authority: Australian Taxation Office / Australian Business Register
Rows: 167
What it is¶
Reference lookup table for ABR entity type codes (e.g. PRV = Australian Private Company, PUB = Australian Public Company) and industry classification codes used in ABN records. Used to interpret entity type fields on RTO records.
Update cadence¶
Infrequent — ABR classification codes change rarely. Re-ingest only when ATO publishes updated classification standards.
22. Unit Stats (Derived Aggregates)¶
data_sources key: unit_stats ✅
Not an external source — computed from qualification_units and rto_scope within the database
Last computed: 2026-03-24 | Records: 15,200
What it is¶
Pre-computed aggregate statistics written back as columns on the units table, making unit-level queries fast without joins:
| Column | Description |
|---|---|
| stat_qual_count | How many qualifications include this unit |
| stat_core_count | How many qualifications include this unit as a core unit |
| stat_elective_count | How many qualifications include this unit as an elective |
| stat_rto_count | How many RTOs deliver qualifications that include this unit |
Update method¶
Re-run after every TGA corpus ingest. Uses batch UPDATE via indexed temp tables. Script is part of the TGA adapter ingest pipeline.
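The batch UPDATE via an indexed temp table might look like the sketch below, shown with Python's sqlite3 module. The column names on qualification_units (qual_code, unit_code, is_core) and the units key (code) are assumptions, and stat_rto_count is omitted because it needs a further join through rto_scope.

```python
import sqlite3

RECOMPUTE_SQL = """
DROP TABLE IF EXISTS temp.tmp_unit_stats;

-- Aggregate once into a temp table rather than per unit row.
CREATE TEMP TABLE tmp_unit_stats AS
SELECT unit_code,
       COUNT(DISTINCT qual_code) AS qual_count,
       COUNT(DISTINCT CASE WHEN is_core = 1 THEN qual_code END) AS core_count,
       COUNT(DISTINCT CASE WHEN is_core = 0 THEN qual_code END) AS elective_count
FROM qualification_units
GROUP BY unit_code;

CREATE INDEX temp.idx_tmp_unit_stats ON tmp_unit_stats (unit_code);

-- Units that appear in no qualification get 0 rather than NULL.
UPDATE units SET
    stat_qual_count = COALESCE(
        (SELECT qual_count FROM tmp_unit_stats WHERE unit_code = units.code), 0),
    stat_core_count = COALESCE(
        (SELECT core_count FROM tmp_unit_stats WHERE unit_code = units.code), 0),
    stat_elective_count = COALESCE(
        (SELECT elective_count FROM tmp_unit_stats WHERE unit_code = units.code), 0);

DROP TABLE temp.tmp_unit_stats;
"""


def recompute_unit_stats(conn: sqlite3.Connection) -> None:
    """Rebuild the stat_* count columns on units from qualification_units."""
    conn.executescript(RECOMPUTE_SQL)
```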
Known Gaps and Issues¶
| Issue | Impact | Priority |
|---|---|---|
| VIC Skills First full list not ingested — only 7 rows | VIC funding intelligence severely incomplete | High |
| ACT not ingested | No ACT state_funding rows | Medium |
| VNDA coverage at 42% (494/1,167 current quals) | Employment/income data missing for 58% of quals | High — Brief #25 addressing |
| Per-qual enrolment/completion volumes missing | No individual qual market size data | High — Brief #25 adding |
| 9 sources not registered in data_sources | Provenance gap — no refresh tracking | High — Brief #25 Part 1 fixing |
| vnda_atlas contains incorrect values | Risk of bad data reaching UI | High — Brief #25 Part 2 deprecating |
| No automated refresh for any data | All current data is one-time seeds from March 2026 | High — Brief #25 Part 4 adding cron Worker |
| NCVER VOCSTATS requires manual download session | Cannot be automated without NCVER API access | Medium |
| NT data incomplete (CDU only) | Batchelor Institute quals missing | Low |
Source Provenance: Unregistered Tables¶
The following tables exist in rtopacks-db with data but no entry in data_sources. Brief #25 Part 1 registers all of these.
| Table | Source to register | source_key |
|---|---|---|
| recon_vnda, recon_vnda_aqf, recon_vnda_foe | JSA VNDA OpenSearch API scrape | jsa_vnda_scrape |
| vnda_atlas | Unknown export — deprecated | jsa_vnda_export |
| osl_ratings | JSA Occupation Shortage List | jsa_osl |
| ivi_vacancies | JSA Internet Vacancy Index | jsa_ivi |
| emp_projections | JSA Employment Projections | jsa_emp_projections |
| glmd_regional | JSA GLMD regional JSON | jsa_glmd |
| state_funding (VIC rows) | VIC Free TAFE page | funded_vic |
| abr_codes | ATO ABR reference | abr_reference |
| All vocstats_* tables | NCVER VOCSTATS | ncver_vocstats |
| jsa_qual_training (new — Brief #25) | JSA Training OpenSearch API | jsa_training_api |