Might be available
(Updated 2023-09-12)
Data Specialist
Sheridan, WY, USA
Native English, Beginner French
- 7+ years of experience with Python
- 4+ years of experience with cloud services (AWS, GCP)
- 5+ years of experience with BI tools (Tableau)
Skills (27)
Scraping
Python
Git
Data Analytics
Data Reporting
Quality Checks
BI
QA
Data Validation
Web Scraping
Data Analysis
Scripting
Google Docs
Quality Engineer
Pipeline
Data Science
JavaScript
pytest
CI/CD
Visualization
API
PySpark
Business Intelligence
Marketing Analysis
GCP
Flask
Continuous Integration/Delivery
Summary
- 8+ years of experience working with Python;
- 5 years of experience in BI and 4 years of experience with Tableau;
- 8 years of experience with various data sets (ETL, Data Engineering, Data Quality Engineering);
- 3 years of experience with Amazon Web Services (AWS) and Google Cloud Platform (GCP);
- Data Analytics/Engineering with cloud service providers (AWS, GCP);
- Experience working with MySQL, SQL, and PostgreSQL;
- Deep experience working with Kubernetes (K8s);
- Hands-on scripting experience with Python; experience with Microsoft Power BI, Tableau, Sisense, CI/CD principles, data validation, data QA, SQL, pipelines, ETL, and automated web scraping;
- Pet Web3 projects (Solidity, wallet integration);
- Upper-intermediate English
TECHNICAL STACK
Main Technical Skills: Python (8.5 yr.), Data Analysis (6 yr.), Google Cloud Platform (GCP) (4 yr.), Tableau (4 yr.), Microsoft Power BI (4 yr.)
Programming Languages: Python (8.5 yr.), JavaScript
Python Frameworks and Libraries: Flask, pandas, scikit-learn, Python Pickle, Django Channels, PySpark
Data Technologies / Analysis / Visualization: Data Analysis (6 yr.), Data Testing (3 yr.), ETL, Apache Spark, Data Scraping, Data Modelling, Data Mining, Apache Airflow, Tableau (4 yr.), Microsoft Power BI (4 yr.)
Databases & Management Systems: SQL, ETL, Apache Spark, MySQL, PostgreSQL, Microsoft SQL Server, AWS ElasticSearch, DynamoDB, RDBMS, ORM
AI & Machine Learning: Machine Learning
Cloud Platforms, Services & Computing: Amazon Web Services (AWS) (3 yr.), Google Cloud Platform (GCP) (4 yr.), Heroku
Amazon Web Services: AWS ElasticSearch, AWS S3
QA & Test Automation: Selenium WebDriver, Unit Testing
Version Control: Git
Operating Systems: Linux
SDK, API and Integrations: RESTful API
Deployment, CI/CD, DevOps, Administration: Pipeline, CI/CD, Kubernetes (K8s)
Other Technical Skills: Data Engineering (6 yr.), Team Collaboration (7 yr.), Big Data, Streamlit, Robotic Process Automation, Cronjob, Data Pipelines / ETL, Parallelization, Datasets, Machine Learning, Web, RPA, Sisense
Work Experience
2023-05 - 2023-06
Responsibilities:
• Web scraper
• Data parser for PDF, Word, and Google Docs
• Machine learning and text/content recognition
• API
• Automation testing for data import and high-load scenarios
Technologies: AWS, RESTful API, Python, pytest, Allure, JavaScript, Docker, Kubernetes
2023-01 - 2023-05
Responsibilities:
• Design and develop Tableau dashboards;
• Produce well-designed, efficient code using best software development practices;
• Perform code reviews for compliance with best engineering practices, coding standards, and the quality criteria set for the projects;
• Provide suggestions to improve the architecture, coding practices, and build/verification toolset, and solve customer problems.
2020-01 - 2022-08
Responsibilities:
• Use Sisense to build dashboards for tracking updates to selected Amazon store brands over defined time periods. Used the interactive SQL palette to query the tables and filter the needed information (columns) to be displayed in the dashboard. This dashboard provides the data engineering manager with the necessary information to make decisions on store brands.
• Create and support ELT data pipelines built on Snowflake and DBT while ensuring high-quality data.
• Develop and deploy data warehousing models and support existing processes/ETLs (extract/transform/load) and functions (in Python/SQL/DBT) in a cloud data warehouse environment using Snowflake and AWS services.
• Write SQL statements and develop in Python.
• Design and develop data pipelines (DAGs) and automation tests (a minimal sketch follows the technologies list below).
Technologies: Sisense, Airflow, Snowflake, Python 3, AWS S3, Hadoop, Spark, GitLab CI/CD, Kubernetes, LDAP.
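As an illustration of the kind of Airflow DAG described above, here is a minimal sketch; the DAG id, connection id, table and stage names, and the dbt project path are hypothetical assumptions, not taken from the actual project.

# Minimal sketch of an ELT DAG (Snowflake load + dbt transform/test).
# All names below are illustrative assumptions.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="brand_updates_elt",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Load the raw extract from an S3 stage into a Snowflake staging table.
    load_raw = SnowflakeOperator(
        task_id="load_raw_from_s3",
        snowflake_conn_id="snowflake_default",
        sql="COPY INTO raw.brand_updates FROM @raw_stage/brand_updates/;",
    )

    # Run the dbt models that build the reporting tables used by the dashboards.
    run_dbt = BashOperator(
        task_id="run_dbt_models",
        bash_command="dbt run --project-dir /opt/dbt/brand_updates",
    )

    # Run dbt tests as the automated data-quality gate.
    test_dbt = BashOperator(
        task_id="test_dbt_models",
        bash_command="dbt test --project-dir /opt/dbt/brand_updates",
    )

    load_raw >> run_dbt >> test_dbt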
2020-09 - 2021-07
• Set up and manage web-based cloud services on AWS EC2.
2020-03 - 2021-03
• Develop a curriculum to be used for teaching Python programming and data analysis.
Python Developer, IoT-leveraged agricultural tech company
A project on monitoring and reporting sample data from agricultural plants on a field of land.
May 2020 - August 2020
Responsibilities:
• Hands-on setting up, maintaining, and deploying services to AWS EC2.
• Automated web scraping of data from webpages using Selenium.
• Carried out multi-processing and parallelization of code with PySpark.
• Used Spark for two cases of data processing in an ELT phase:
1. Data was collected from drones, and other specialized bots were used to physically survey the land area and take samples from the soil and air for properties such as soil pH, moisture content, and specific gravity for the different types of crops planted on the field. This data was received in real time and placed on a queue to be loaded into AWS DynamoDB. The transformation involved converting some data properties from the queue, such as temperature from degrees Celsius to the Kelvin scale and moisture content from cubic centimeters to cubic meters. The transformed data was then loaded into AWS S3.
2. Processed large batch data averaging 10 million rows with Spark: there were cases where I had to transform data in a different database containing historical data to consolidate the currently maintained tables in another database. The historical data contains millions of rows of IoT-generated values. To optimize speed and memory usage for the transformation, I used Python's implementation of Spark (PySpark) to carry out the same transformation technique on the batch data to backfill the current table in the database.
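A minimal PySpark sketch of the unit-conversion transform and batch backfill described above; the column names and S3 paths are hypothetical assumptions, not taken from the actual project.

# Minimal sketch of the Celsius->Kelvin and cm^3->m^3 transform applied to batch data.
# Column names and paths are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("iot_sample_backfill").getOrCreate()

# Historical batch data exported from the source database (on the order of 10 million rows).
samples = spark.read.parquet("s3a://iot-farm-data/raw/soil_air_samples/")

transformed = (
    samples
    # degrees Celsius -> Kelvin
    .withColumn("temperature_k", F.col("temperature_c") + 273.15)
    # cubic centimeters -> cubic meters
    .withColumn("moisture_m3", F.col("moisture_cc") / 1_000_000)
    .drop("temperature_c", "moisture_cc")
)

# Append the converted rows to the curated dataset used to backfill the current table.
transformed.write.mode("append").parquet("s3a://iot-farm-data/curated/soil_air_samples/")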
2019-07 - 2021-01
• Developed and maintained cloud services on the Google Cloud Platform.
• Developed questionnaire scripts on ODK for market research.
• Led a data collection team of 10 people.
• Performed data analysis using data tools, visualizations, and dashboards.
• Used Power BI and Tableau to provide daily and historical data reports and visualizations to the technical director. Daily and historical reports included tracking the coverage of data collection in geographical areas and providing updates on data quality checks and target data samples.
2019-01 - 2019-04
• Maintaining SQL databases for proper scaling.
• Ensuring proper unit tests are integrated to promote clean code.
2017-01 - 2017-01
• Carried out analytics with Microsoft Azure for prediction models.
• Generated various visualization models for data analytics with Power BI and Seaborn.
2016-07 - 2016-12
2015-04 - 2015-07
Tableau Experience Highlights:
1. Real Estate Project
• Developed and maintained predictive algorithms for US house prices using machine learning techniques such as regression and classification
• Created interactive data visualizations for real estate agents and investors using Tableau
• Analyzed a variety of data points on comparables for single-family homes and condos, including location, property age, and amenities
• Assessed factors like ARV (After Repair Value), square footage, year built, number of beds and baths, garages, and local market conditions
• Developed user-friendly dashboards to display real-time market trends and property values, enabling investors to make informed decisions quickly
• Collaborated with a team of data scientists and engineers to continuously improve algorithms and visualizations
2. Tableau Specialist in Market Research Project
• Utilized Tableau for comprehensive daily and historical data reporting and visualization to support decision-making processes
• Provided data insights and visualizations to the technical director, enabling a better understanding of market dynamics and trends
• Created a range of custom dashboards for daily and historical reports covering various aspects such as sales, customer demographics, and product performance
• Monitored and analyzed data collection coverage in target geographical areas to ensure accurate representation of the market
• Conducted regular data quality checks, including data validation and cleaning, to maintain high data accuracy and reliability
• Collaborated with data engineers and analysts to optimize data collection methods and improve overall data quality
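A minimal scikit-learn sketch of the kind of house-price regression described in the first highlight above; the dataset path, feature names, and model choice are hypothetical assumptions, not taken from the actual project.

# Minimal sketch of a comparables-based house-price regression.
# The CSV path and feature names are illustrative assumptions.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Comparables data: one row per sold property with its sale price.
df = pd.read_csv("comparables.csv")
features = ["sqft", "year_built", "beds", "baths", "garages"]
X, y = df[features], df["sale_price"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A tree ensemble is one reasonable baseline for tabular price prediction.
model = RandomForestRegressor(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print("MAE on held-out comparables:", mean_absolute_error(y_test, model.predict(X_test)))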
Academic Background
2018-01 - 2019-01
2013-01 - 2016-01