The aim of the project was to develop a model that extracts occupational fields and employee skills from job advertisements on a job platform.
The project is based on a data set of posted job advertisements obtained by web crawling.
Many job platforms prohibit web crawling. In addition, the portal used for this project changed its source code continually, which also interfered with crawling.
A structured data set containing position descriptions, company names, locations and publication dates was created with a Python-based web crawler. The data set was available for download as a CSV file.
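As an illustration of that structured output, the following is a minimal sketch of how crawled records could be written to CSV with the Python standard library. The record type JobPosting, its field names, the helper write_postings and the sample rows are assumptions made for this example. They are not taken from the project's crawler.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class JobPosting:
    # Hypothetical record layout mirroring the four fields named in the text.
    description: str
    company: str
    location: str
    published: str  # publication date as an ISO string


def write_postings(postings, handle):
    """Write postings to an open text handle as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(JobPosting)])
    writer.writeheader()
    for posting in postings:
        writer.writerow(asdict(posting))


# Made-up sample rows for illustration only; not project data.
sample = [
    JobPosting("Data analyst, SQL and Python", "Example GmbH", "Berlin", "2019-05-14"),
    JobPosting("Backend developer, Java", "Sample AG", "Munich", "2019-05-15"),
]

buffer = io.StringIO()
write_postings(sample, buffer)
csv_text = buffer.getvalue()
```

Writing to a text handle rather than a fixed path keeps the sketch independent of where the file is finally stored or served for download.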
The occupational fields and skills were labeled with the Dataturks tool and downloaded in JSON format.
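A labeled export of this kind could be read roughly as sketched below. The sketch assumes a line-delimited JSON layout in which each record carries a "content" string and an "annotation" list whose entries hold "label" and "points" (with "start", "end" and "text"). The exact export layout used in the project is not stated here, and the label names OCCUPATION and SKILL as well as the sample record are invented for illustration.

```python
import json
from collections import defaultdict


def collect_labels(lines):
    """Group annotated text spans by label across all exported records."""
    spans_by_label = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        # Unlabeled records may carry a null or missing annotation list.
        for annotation in record.get("annotation") or []:
            for label in annotation.get("label", []):
                for point in annotation.get("points", []):
                    spans_by_label[label].append(point["text"])
    return dict(spans_by_label)


# Made-up record in the assumed layout; offsets are illustrative.
export_lines = [
    json.dumps({
        "content": "Data analyst with SQL skills",
        "annotation": [
            {"label": ["OCCUPATION"],
             "points": [{"start": 0, "end": 11, "text": "Data analyst"}]},
            {"label": ["SKILL"],
             "points": [{"start": 18, "end": 20, "text": "SQL"}]},
        ],
    }),
]

labels = collect_labels(export_lines)
```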
The results are presented in a user-friendly dashboard. The user can select a company, skills or occupational fields according to individual needs. Filtering by company produces a diagram in which the size of each rectangle corresponds to the number of open positions at that company.
Selecting a specific skill or level of experience displays the relative number of open positions across all companies. Further analyses, such as combinations of skills, can be carried out as well.
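The two dashboard quantities described above can be sketched as simple aggregations: the number of open positions per company, which would drive the rectangle sizes, and the share of all postings that require a selected skill or combination of skills. Reading the displayed figure as a share of all postings is an assumption of this sketch. The data layout, the function names positions_per_company and share_with_skills, and the sample postings are hypothetical and do not describe the project's dashboard code.

```python
from collections import Counter

# Made-up postings with already extracted skills; for illustration only.
postings = [
    {"company": "Example GmbH", "skills": {"python", "sql"}},
    {"company": "Example GmbH", "skills": {"java"}},
    {"company": "Sample AG", "skills": {"python", "sql", "excel"}},
    {"company": "Sample AG", "skills": {"sql"}},
]


def positions_per_company(postings):
    """Count open positions per company (the rectangle sizes in the diagram)."""
    return Counter(p["company"] for p in postings)


def share_with_skills(postings, required):
    """Return the fraction of postings that list every skill in `required`."""
    required = set(required)
    if not postings:
        return 0.0
    matching = sum(1 for p in postings if required <= p["skills"])
    return matching / len(postings)


counts = positions_per_company(postings)
sql_share = share_with_skills(postings, {"sql"})
combined_share = share_with_skills(postings, {"python", "sql"})
```

Passing a set of several skills covers the combined-skills case, because a posting only counts when it lists all of them.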
The algorithm can be extended to scan the large number of candidate resumes a company receives.
The results present a solution that helps companies save resources on market research.