
Natural Language Processing Applied on a Job Platform
Project scope
The aim of the project was to develop a model that extracts occupational fields and required employee skills from job advertisements on a job platform.
Data Sets
The project is based on a data set of posted job advertisements obtained by web crawling.
Challenges and solutions
Many job platforms prohibit web crawling. In addition, the portal used for the project frequently changed its source code, which also interfered with crawling.
Applied Methods
A Python-based web crawler created a structured data set containing position descriptions, company names, locations and publication dates. The data set was available for download as a CSV file.
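As a minimal sketch of the export step, the snippet below serializes crawled records to CSV with Python's standard csv module. The column names, the helper jobs_to_csv and the sample record are hypothetical and chosen for illustration only; the source names the four kinds of fields but not the crawler's actual schema.

```python
import csv
import io

# Hypothetical column names; the project only states which kinds of fields exist.
FIELDS = ["description", "company", "location", "published"]


def jobs_to_csv(records):
    """Serialize a list of job dicts to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=FIELDS,
        extrasaction="ignore",  # silently drop keys that are not in FIELDS
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Missing fields are written as empty strings instead of raising.
        writer.writerow({name: record.get(name, "") for name in FIELDS})
    return buffer.getvalue()


# Made-up sample record, not taken from the project's data.
sample = [
    {
        "description": "Data Engineer (m/f/d)",
        "company": "Example GmbH",
        "location": "Berlin",
        "published": "2019-05-02",
    },
]
csv_text = jobs_to_csv(sample)
print(csv_text)
```

Fixing the column order in one place keeps the file stable even if the crawler later collects additional fields.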
The occupational fields and skills were labeled with the Dataturks tool, and the annotations were downloaded in JSON format.
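The labeled export can then be turned into (label, text) pairs for model training. The sketch below assumes a line-delimited JSON export in which each record has a "content" string and an "annotation" list whose entries carry "label" and "points"; this layout is an assumption, since the schema of the export used in the project is not stated. The function name and the label names "Skill" and "Field" are hypothetical.

```python
import json


def extract_labeled_spans(jsonl_text):
    """Return (label, text) pairs from a line-delimited JSON annotation export.

    Assumed record layout (not confirmed by the project description):
    {"content": "...", "annotation": [{"label": [...], "points": [{"text": ...}]}]}
    """
    spans = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # "annotation" may be missing or null for unlabeled documents.
        for annotation in record.get("annotation") or []:
            labels = annotation.get("label", [])
            if isinstance(labels, str):
                labels = [labels]
            for point in annotation.get("points", []):
                for label in labels:
                    spans.append((label, point["text"]))
    return spans


# Made-up example record for illustration.
example = json.dumps({
    "content": "Python developer in sales analytics",
    "annotation": [
        {"label": ["Skill"], "points": [{"start": 0, "end": 5, "text": "Python"}]},
        {"label": ["Field"], "points": [{"start": 20, "end": 24, "text": "sales"}]},
    ],
})
spans = extract_labeled_spans(example)
print(spans)
```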

Project outcome
The results are presented in a user-friendly dashboard. The user can select companies, skills or occupational fields depending on individual needs. Filtering by company produces a diagram in which the size of each rectangle corresponds to the number of open positions at that company.
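The value behind each rectangle is a simple count of postings per company. A minimal sketch of that aggregation follows, assuming each posting is a dict with a "company" key; the function name and sample data are hypothetical, and the dashboard itself may compute this differently.

```python
from collections import Counter


def positions_per_company(postings):
    """Count open positions per company; each count sizes one rectangle."""
    return Counter(posting["company"] for posting in postings)


# Made-up postings for illustration.
postings = [
    {"company": "Example GmbH"},
    {"company": "Example GmbH"},
    {"company": "Sample AG"},
]
counts = positions_per_company(postings)
print(counts.most_common())
```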
Selecting a specific skill or experience level displays the relative number of open positions across all companies. Further analyses, such as combinations of skills, can be carried out as well.
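One way to read the relative number is as the share of all postings that require the selected skills. The sketch below computes that share and handles skill combinations by requiring every selected skill to be present. The posting structure, the function name skill_share and the sample data are assumptions made for illustration.

```python
def skill_share(postings, required_skills):
    """Return the fraction of postings that list every skill in required_skills."""
    required = set(required_skills)
    if not postings:
        return 0.0  # avoid division by zero on an empty selection
    matches = sum(1 for posting in postings if required <= set(posting["skills"]))
    return matches / len(postings)


# Made-up postings for illustration.
postings = [
    {"company": "Example GmbH", "skills": ["python", "sql"]},
    {"company": "Example GmbH", "skills": ["python"]},
    {"company": "Sample AG", "skills": ["excel"]},
    {"company": "Sample AG", "skills": ["sql"]},
]
single = skill_share(postings, ["python"])
combined = skill_share(postings, ["python", "sql"])
print(single, combined)
```

Using a subset test keeps single skills and skill combinations on the same code path.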
The algorithm can be extended to scan the large number of candidate resumes a company receives.
The results offer a way for companies to save resources on market research.
Category
NLP
Technologies
Web crawling
Natural Language Processing
Deep Learning
Power BI