What follows is a high-level analysis of screen scraping and web scraping strategies and frameworks for Go, current as of July 2019.

State of Affairs

Web scraping spans a broad range of activity: archiving content, search engine indexing, spiders and crawlers, ETL (extract, transform, load) workflows, parsing public JSON, RSS, and XML feeds and HTML pages, sophisticated bots and machine learning protocols that emulate a human with a web browser, and acceptance testing and QA (quality assurance) workflows.
With a bit of configuration, a service such as PostgreSQL running on the localhost of an Ubuntu/Linux host machine can be accessed from inside a Docker container.

Networking Configuration

The first thing to understand is that Docker has different networking modes, and the way you connect to a host service from inside the container depends on which mode the container uses.
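As a rough illustration of how the address can depend on the networking mode, here is a minimal Go sketch. It assumes two things not stated above: that with host networking the container shares the host's network stack, so `127.0.0.1` reaches the host, and that with the default bridge network the host is reachable at the `docker0` gateway, which is commonly `172.17.0.1` but may differ on your machine. The function names, the user `app`, and the database `appdb` are hypothetical.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// hostAddress returns the address a container might use to reach a service
// on the Docker host. Both values are assumptions for illustration:
// "host" mode shares the host network stack, and "bridge" mode uses the
// common default docker0 gateway, which can differ per installation.
func hostAddress(mode string) (string, error) {
	switch mode {
	case "host":
		return "127.0.0.1", nil
	case "bridge":
		return "172.17.0.1", nil
	default:
		return "", fmt.Errorf("unsupported networking mode %q", mode)
	}
}

// postgresURL builds a PostgreSQL connection URL for the given mode,
// using the default PostgreSQL port 5432.
func postgresURL(mode, user, db string) (string, error) {
	host, err := hostAddress(mode)
	if err != nil {
		return "", err
	}
	u := url.URL{
		Scheme: "postgres",
		User:   url.User(user),
		Host:   net.JoinHostPort(host, "5432"),
		Path:   "/" + db,
	}
	return u.String(), nil
}

func main() {
	for _, mode := range []string{"host", "bridge"} {
		dsn, err := postgresURL(mode, "app", "appdb")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(mode, "->", dsn)
	}
}
```

In bridge mode the host service also has to listen on an address the container can reach, which is part of the configuration mentioned above.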
The following is a write-up I did while working on the ScoutRed ETL pipeline, attempting to find the best way to serialize and store dates when the data comes primarily from paper forms, where we don’t necessarily know (or want to know) the timezone or locality. A date is defined here as an entity that includes a year, month, and day, but no information about the time or timezone.