Data Engineering - Career Path

These questions have been curated from interviews with various data professionals in the industry, as presented in the IBM Data Engineering specialization. They will be helpful to anyone preparing for interviews for DE roles, and they also double as a quick checklist for brushing up on skills.

Data Engineering Domains:

  1. Data integration
  2. Data pipelines
  3. Data lakes
  4. Data warehouses
  5. Distributed systems
  6. Data

What employers look for in a Data Engineer

  1. Exposure to a breadth of data-related technologies; requirements can vary between jobs and roles.
  2. Exposure to data sources such as relational databases, NoSQL databases, in-memory databases, and key-value stores.
  3. Experience in data movement processes, such as:
     • Moving data from an RDBMS to a NoSQL database
     • Pulling data from social media using APIs (see the sketch after this list)
     • Loading data into analytical platforms such as Hadoop
  4. Good analytical and problem-solving skills.
  5. Somebody who is inquisitive and asks additional questions to figure out the direction to take.
  6. Somebody who can communicate really well.
  7. A strong work ethic and ownership of what they do.
  8. SQL
  9. Data modelling
  10. ETL technologies
  11. Programming (Python)
  12. Skills in RDBMS
  13. Expertise in schema design
  14. Ability to work on ETL and ELT processes
  15. Ability to handle streaming data
  16. Ability to handle multiple data formats and file formats
  17. Ability to work with web APIs and web scraping
  18. Basic data analytics skills
  19. Automation of routine work
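
To make the API-based data movement item concrete, here is a minimal extract-and-load sketch in Python using only the standard library. The endpoint URL, field names, and table schema are all hypothetical placeholders, not any real service's API.

# Minimal extract-and-load sketch: pull JSON records from a web API
# and insert them into a relational store (SQLite).
# All names below (URL, fields, table) are hypothetical placeholders.
import json
import sqlite3
import urllib.request

API_URL = "https://api.example.com/v1/posts"  # hypothetical endpoint

def extract(url):
    """Fetch a JSON array of records from the API."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))

def load(records, db_path="posts.db"):
    """Create the target table if needed and upsert the records."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "id INTEGER PRIMARY KEY, author TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO posts (id, author, body) VALUES (?, ?, ?)",
        [(r["id"], r.get("author"), r.get("body")) for r in records],
    )
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(extract(API_URL))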

Work on:

  1. Build strong foundations: SQL, Python, data modelling, and ETL methodologies.
  2. Pay attention to hands-on experience: leverage open-source tools and build hands-on projects.
  3. Come up with your own project, such as building a database.
  4. Get involved with other people who are working in that area to learn from them.
  5. Learn database internals.
  6. Learn a procedural language such as shell scripting, PL/SQL, or Perl.
  7. Learn object-oriented programming in Python.
  8. Learn a functional programming language such as Scala.
  9. Master at least one NoSQL database: MongoDB, Cassandra, or Neo4j.
  10. Understand web scraping (see the sketch after this list).
  11. Understand how APIs work.
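
As a starting point for the web-scraping item, here is a minimal sketch built on Python's standard library. The target URL is a placeholder; a real scraper would typically use libraries such as Beautiful Soup or Scrapy, and must respect each site's robots.txt.

# Minimal web-scraping sketch using only the Python standard library.
# The URL is a placeholder; real scrapers should honor robots.txt.
import urllib.request
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href attribute of every anchor tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def scrape_links(url):
    """Download a page and return all link targets found in it."""
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    parser = LinkCollector()
    parser.feed(html)
    return parser.links

if __name__ == "__main__":
    for link in scrape_links("https://example.com"):  # placeholder URL
        print(link)
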
Tools and technologies to get familiar with:

  • Cloud computing and cloud platforms: Amazon Web Services (AWS), Microsoft Azure, Spring Cloud, Google Cloud Storage (GCS)
  • Data warehouse tools: Snowflake, Databricks, BigQuery, Redshift, IBM Db2
  • Data pipeline tools: Apache Kafka, Apache Airflow, Luigi (a minimal Airflow sketch follows this list)
  • Big data tools: Apache Hadoop, Apache Spark, Apache Hive
  • Operating systems: UNIX, Linux
  • Programming languages: SQL, Bash, Python, R, Java, C++
  • Databases: Cassandra, Microsoft SQL Server, MySQL, PostgreSQL, Amazon DynamoDB, Apache Solr, IBM Db2, MongoDB, Neo4j, Oracle Database
  • Metadata management software: CA Erwin Data Modeler, Oracle Warehouse Builder, SAS Data Integration Server, Talend Data Fabric, Alation Data Catalog, SAP Information Steward, Azure Data Catalog, IBM Watson Knowledge Catalog, Oracle Enterprise Metadata Management (OEMM), Adaptive Metadata Manager, Unifi Data Catalog, data.world, and Informatica Enterprise Data Catalog
  • Agile software development methodologies
  • Version control: Git
  • Modelling and API development
  • Business intelligence and data analysis software: IBM Cognos Impromptu, MicroStrategy, Microsoft Power BI, Google Analytics, InsightSquared, Oracle Business Intelligence Enterprise Edition, QlikTech QlikView, Sisense, Tableau, Dundas BI, SAS Analytics, Domo, SAP Lumira
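
Of the pipeline tools above, Apache Airflow comes up often in interviews, so here is a minimal DAG sketch showing the basic shape of a pipeline definition. It assumes Airflow 2.x is installed; the DAG name and task bodies are placeholder stubs, not a real ETL implementation.

# Minimal Apache Airflow DAG sketch (assumes Airflow 2.x is installed).
# The task bodies are placeholders; a real DAG would run extract/
# transform/load logic against actual systems.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull data from a source system")  # placeholder

def transform():
    print("clean and reshape the data")  # placeholder

def load():
    print("write the data to a warehouse")  # placeholder

with DAG(
    dag_id="example_etl",            # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",      # run once a day
    catchup=False,                   # skip backfilling past runs
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)

    # Declare ordering: extract, then transform, then load.
    t_extract >> t_transform >> t_load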