Job Description
Stylumia is Hiring a Data Scraping Engineer
Who We Are:
At Stylumia, we’re all about making fashion and lifestyle retail smarter and better for everyone. We believe that with the right data and technology, people and businesses in this industry can tackle tough problems and make positive changes in the world. When we started in 2015-16, we noticed that fashion trends were predicted and products were planned largely on guesswork, not facts. That guesswork led to enormous waste: over $750 billion worth every year globally. We realized that a big part of this waste came from poor decisions, so we created Stylumia, a company focused on using advanced technology to solve these retail problems. Our goal is to help people in the fashion industry make smarter decisions and reduce waste by giving them insights grounded in real consumer data.
Skills & Responsibilities:
Skills Needed:
- Strong programming skills, especially in Python and Node.js.
- Hands-on experience with web scraping using tools such as BeautifulSoup, Scrapy, and Puppeteer (a minimal sketch follows this list).
- Solid understanding of databases and working knowledge of SQL.
- Familiarity with Git for version control.
- Knowledge of caching systems such as Redis.
- Experience with search engines such as Elasticsearch.
- Strong problem-solving skills, especially with complex data.
- Close attention to detail to keep data accurate.
- Clear communication and the ability to work in a team.
- Understanding of data privacy and security.
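To give a flavor of the day-to-day work, here is a minimal scraping sketch using requests and BeautifulSoup. The URL, user agent, and CSS selectors are hypothetical placeholders for illustration, not a real Stylumia pipeline.

```python
# A minimal scraping sketch using requests + BeautifulSoup.
# The URL and CSS selectors below are hypothetical placeholders.
import requests
from bs4 import BeautifulSoup

def scrape_product_listings(url: str) -> list[dict]:
    """Fetch a listing page and extract product name and price."""
    response = requests.get(
        url,
        headers={"User-Agent": "stylumia-scraper-demo"},  # placeholder UA
        timeout=10,
    )
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    products = []
    for card in soup.select("div.product-card"):  # hypothetical selector
        name = card.select_one("h2.title")
        price = card.select_one("span.price")
        if name and price:
            products.append({
                "name": name.get_text(strip=True),
                "price": price.get_text(strip=True),
            })
    return products

if __name__ == "__main__":
    for item in scrape_product_listings("https://example.com/catalog"):
        print(item)
```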
Responsibilities:
- Write and maintain scripts that gather data from a variety of sources.
- Work with the data team to define what data is needed and how to obtain it.
- Validate scraped data for accuracy and completeness (see the sketch after this list).
- Optimize the scraping process for speed and reliability.
- Troubleshoot and fix data-collection problems.
- Set up and manage databases and other data-storage systems.
- Collaborate with other teams to put scraped data to use in their projects.
- Stay up to date with the latest web-scraping techniques and tools.
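As an illustration of the data-quality responsibility above, here is a small sketch of a post-scrape validation pass. The field names and rules are assumptions made for illustration, not Stylumia’s actual schema.

```python
# A sketch of a post-scrape validation pass.
# Field names and rules are illustrative assumptions, not a real schema.
import re

REQUIRED_FIELDS = ("name", "price", "url")
PRICE_PATTERN = re.compile(r"^\$?\d+(\.\d{1,2})?$")

def validate_record(record: dict) -> list[str]:
    """Return a list of problems found in one scraped record (empty if clean)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")
    price = record.get("price", "")
    if price and not PRICE_PATTERN.match(price.replace(",", "")):
        problems.append(f"unparseable price: {price!r}")
    return problems

records = [
    {"name": "Denim Jacket", "price": "$59.99", "url": "https://example.com/p/1"},
    {"name": "", "price": "N/A", "url": "https://example.com/p/2"},
]
clean = [r for r in records if not validate_record(r)]
print(f"{len(clean)} of {len(records)} records passed validation")
```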
Preferred Qualifications:
- Experience gathering data from many sources concurrently (see the sketch after this list).
- Experience cleaning and preparing data.
- Familiarity with cloud platforms such as AWS, GCP, or Azure.
- Experience with Docker for containerization.
- Familiarity with data-visualization tools.
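On the point about gathering data from many sources at once, here is a sketch of concurrent fetching with a thread pool. The URLs are placeholders, and the fetch function only records status codes to keep the example short.

```python
# A sketch of fetching many sources concurrently with a thread pool.
# The URLs are hypothetical placeholders.
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

SOURCES = [
    "https://example.com/catalog",
    "https://example.org/feed",
    "https://example.net/listings",
]

def fetch(url: str) -> tuple[str, int]:
    """Fetch one URL and return (url, status_code)."""
    response = requests.get(url, timeout=10)
    return url, response.status_code

with ThreadPoolExecutor(max_workers=8) as pool:
    futures = {pool.submit(fetch, url): url for url in SOURCES}
    for future in as_completed(futures):
        try:
            url, status = future.result()
            print(f"{url} -> {status}")
        except requests.RequestException as exc:
            print(f"{futures[future]} failed: {exc}")
```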