We're looking for a Web Scraping Expert to join our team: a savvy specialist in crawling systems who can ensure a constant flow of "big data" from sources in all corners of the world.
Legentic employs a large-scale data acquisition process. Your main task is to ensure an uninterrupted flow of data from the various sources we’re crawling. You will be working closely with the data team, advising them and identifying the issues they run into. Together you'll build tools that tackle existing impediments and develop techniques and software solutions that increase data yield and quality.
More specifically, we currently run a Java-based crawling system that uses various strategies for crawling the web. Your job is to understand this system, understand its limitations and come up with solutions that will improve the success rate of the crawlers.
You must therefore have demonstrable experience with crawling systems and frameworks.
Professional experience working with crawling systems/frameworks (Scrapy, Selenium, headless browsers, Puppeteer, etc.)
Good understanding of how the web works (HTTP, SPAs, etc.)
Solid knowledge of, and production experience with, Python or Java
It would be great if you could tick off these boxes as well:
BSc or MSc in computer science, data science, engineering or similar discipline
Fluent in Docker
Experience writing async code
Experience with distributed crawling systems
As our Web Scraping Expert, you will have an interesting and challenging job. You will have a great opportunity to learn new skills and grow as an IT specialist in a young, collaborative company culture. You will have colleagues all across Europe, and soon in the US too.
Our techies are located in Helsinki, Finland; Oslo, Norway; and Cluj, Romania. This time we would prefer the Web Scraping Expert to be based in either Helsinki or Cluj. A good cultural fit is the most important attribute for this role. Our virtual workplace enables each team member to take a self-motivated, disciplined and highly responsive approach to achieving team success.
Enquiries: Iulia Druta – firstname.lastname@example.org