title_mapper

Efficiently standardizes scraped job titles to Bureau of Labor Statistics (BLS) titles using a high-performance TF-IDF algorithm.

Maintainer(s): martin-conur

Installing and Loading

INSTALL title_mapper FROM community;
LOAD title_mapper;

Example

-- Standardize a column
SELECT standardize_title(scraped_title_column) FROM your_table;

-- Standardize tech job titles
SELECT standardize_title('Sr. Software Eng') AS standardized_title;
-- Result: 'Software Engineer - Software Developers'

-- Standardize healthcare titles
SELECT standardize_title('RN - Emergency Room') AS standardized_title;
-- Result: 'Registered Nurse - Registered Nurses'

About title_mapper

DuckDB Title Mapper

duckdb-title-mapper is a highly optimized DuckDB extension written in Rust. It standardizes scraped job titles to BLS (Bureau of Labor Statistics) standard titles using a fast TF-IDF implementation.

What It Does

This extension transforms messy, inconsistent job titles from various sources into standardized BLS titles:

Scraped Title (Input)                 | Standardized Title (Output)
Sr. Software Eng                      | Software Engineer
Registered Nurse - ICU                | Registered Nurse
Accountant III                        | Accountant
Sales Rep (B2B)                       | Sales Representative
Elementary School Teacher - 3rd Grade | Elementary School Teacher
Exec. Chef                            | Executive Chef
Marketing Coordinator/Specialist      | Marketing Specialist
Licensed Practical Nurse (LPN)        | Licensed Practical Nurse
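
To give a feel for how TF-IDF matching can map a messy title onto a standard one, here is a minimal Python sketch. It is not the extension's Rust implementation: the choice of character trigrams, the smoothed IDF formula, cosine similarity as the score, the tiny STANDARD_TITLES list, and the helper names ngrams, normalize, build_index, and best_match are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical reference list; the real extension ships its own BLS titles.
STANDARD_TITLES = [
    "Software Engineer",
    "Registered Nurse",
    "Accountant",
    "Sales Representative",
    "Executive Chef",
]

def ngrams(text, n=3):
    """Lowercase, replace punctuation with spaces, and return character n-grams."""
    cleaned = "".join(c if c.isalnum() else " " for c in text.lower())
    padded = " " + " ".join(cleaned.split()) + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def normalize(vec):
    """Scale a sparse vector to unit length (empty dict if it has no weight)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / norm for g, w in vec.items()} if norm else {}

def build_index(titles):
    """Compute smoothed IDF weights and unit-length TF-IDF vectors per title."""
    docs = [Counter(ngrams(t)) for t in titles]
    doc_freq = Counter(g for d in docs for g in d)  # titles containing each n-gram
    n = len(titles)
    idf = {g: math.log((1 + n) / (1 + c)) + 1.0 for g, c in doc_freq.items()}
    vectors = [normalize({g: tf * idf[g] for g, tf in d.items()}) for d in docs]
    return idf, vectors

def best_match(query, titles, idf, vectors):
    """Return (title, cosine score) of the closest title, or (None, 0.0)."""
    counts = Counter(ngrams(query))
    # n-grams never seen in the reference titles carry no weight and are dropped.
    q = normalize({g: tf * idf[g] for g, tf in counts.items() if g in idf})
    scores = [sum(w * v.get(g, 0.0) for g, w in q.items()) for v in vectors]
    best = max(range(len(titles)), key=scores.__getitem__)
    if scores[best] == 0.0:
        return None, 0.0
    return titles[best], scores[best]

idf, vectors = build_index(STANDARD_TITLES)
for raw in ["Sr. Software Eng", "Registered Nurse - ICU", "Exec. Chef"]:
    print(raw, "->", best_match(raw, STANDARD_TITLES, idf, vectors)[0])
```

Character n-grams are used here because abbreviations such as "Eng" or "Exec." share substrings, but not whole words, with their full forms. A scheme like this cannot resolve acronyms such as "RN" on its own, so the extension presumably relies on more than this sketch shows.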

Added Functions

function_name     | function_type | description                                 | comment | examples
standardize_title | scalar        | Returns the BLS standard title using TF-IDF | NULL    | [SELECT standardize_title(scraped_title_column) FROM your_table;]