Resources to learn about engineering aspects of data analytics (OLAP, warehousing, ETL, etc.)
- by JT
I'm a math/stats guy interested in learning more about the engineering aspects of "data analytics". (This may be an overly broad term; it's a case of "I don't know what I don't know", so I'm not sure how to be more specific.)
I'm fine with manipulating and analyzing the data once it's already stored somewhere and I can access it, and I'm fine with writing scripts and SQL queries (and have a general knowledge of things like normalization). What I don't know is the whole engineering process of capturing and storing the data. For example, terms I've heard thrown around but only vaguely understand include:
- OLAP, OLTP
- Data warehousing
- ETL
- ???
What's a good book (or any other resource) to learn about these kinds of things? What are things I should know about database design (normalization seems kinda "obvious" to me, something I would have done even before I knew the term -- is there anything else?)?
In other words, for jobs falling under the umbrella term of "analytics engineer", what kinds of things should I know?