A key point within the lifecycle is the availability of the data and the format it comes in.  Clean, prepared data sets are available from a wide variety of sources, both inside companies and externally.  Here you will hear us talk about 'by hook or by crook', which means we'll get the data via automated, semi-automated, or manual methods.

Automated data access is the gold standard and can take a wide variety of forms (e.g., API, XML, database access).  It simply requires someone to write a script, and then you'll get the data.
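
As a rough illustration of the automated case, here is a minimal sketch of a script that pulls data straight from a database. It assumes a SQLite database; the table `measurements`, its columns, and the function `fetch_measurements` are hypothetical names made up for this example, and an in-memory database stands in for a real data source.

```python
import sqlite3


def fetch_measurements(conn, min_value):
    """Return (name, value) rows whose value is at least min_value."""
    cursor = conn.execute(
        "SELECT name, value FROM measurements WHERE value >= ? ORDER BY name",
        (min_value,),
    )
    return cursor.fetchall()


# Stand-in for a real data source: an in-memory database with a
# hypothetical table and a few made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (name TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("a", 1.5), ("b", 0.2), ("c", 3.0)],
)
conn.commit()

# Once the script exists, getting the data is a single call.
rows = fetch_measurements(conn, 1.0)
print(rows)
```

Against a real source, only the connection line would likely change; the point is that no manual step sits between the request and the data.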

Semi-automated access means one may download the data manually but then process and prepare it via a script or other automated methods.
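
A minimal sketch of the semi-automated case might look like the following. It assumes the manually downloaded file is a CSV export with `name` and `score` columns; those column names, the sample rows, and the function `prepare_rows` are hypothetical, and an in-memory string stands in for the downloaded file.

```python
import csv
import io


def prepare_rows(file_obj):
    """Clean rows from a manually downloaded CSV export."""
    cleaned = []
    for row in csv.DictReader(file_obj):
        name = (row.get("name") or "").strip()
        raw_score = (row.get("score") or "").strip()
        if not name or not raw_score:
            continue  # skip rows with a missing field
        try:
            score = float(raw_score)
        except ValueError:
            continue  # skip rows whose score is not numeric
        cleaned.append({"name": name, "score": score})
    return cleaned


# Stand-in for a file someone downloaded by hand (made-up contents).
downloaded = io.StringIO("name,score\n Alice ,90\nBob,\nCara,n/a\nDan,72.5\n")
prepared = prepare_rows(downloaded)
print(prepared)
```

With a real download, the same function could be called on a file opened with `open(path, newline="")`, so the manual step is limited to fetching the file.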

Manual access means the data is not accessible and you need to go generate and collect it.  This could lead to automated access once new content is being generated (e.g., experiments, surveys), but technically the data does not exist yet and you have to figure out how to create it.