I am going to develop a scraper in Node.js that continuously gets events, categories and odds from 3 different sources (they all use the same event_ids for events).

Every scraper needs a separate process, and scraping will work like this:

1. Get events from the JSON response, asynchronously iterate through the events and check whether each one exists in the database by looking up its event_id field. If it exists, update it; otherwise insert it.
2. After that, asynchronously iterate through the odds and check whether each one exists in the database. If it exists, update odd_value_scrapername; otherwise insert odd_value_scrapername.
3. Make a selectable column/index with the highest odd value from the scrapers (e.g. if odd_value_scraper1 is 1.40, odd_value_scraper2 is 1.50 and odd_value_scraper3 is 1.65, the highest_odd column value will be 1.65 for that event).
4. Stream odd changes/inserts via websockets to the clients (note that only the highest_odd column will be streamed as the odd value).

These 3 sources have 2k events and 10k odds in total. The odd values change every minute, so there will be roughly 24k selects/updates, and maybe inserts, per minute on the database. Please note that updating odds requires no locks on the database, as each scraper will only update its own odd_value_scrapername column for the same event_ids.

What is the best and most server-resource-friendly way to achieve this? Which database engine fits these needs?

Thanks
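To make the update-or-insert and highest_odd steps concrete, here is a minimal sketch in plain Node.js. It is only an illustration under stated assumptions: the `Map` stands in for a real database table, and `applyOdd`, `oddId` and the scraper names are hypothetical, not part of any library. The function reports a change only when the highest value actually moves, which is the point at which a websocket message would be sent.

```javascript
// In-memory stand-in for the odds table (assumption: a real setup would
// use a database). Keys are hypothetical odd ids; each row holds one
// odd_value_<scraper> column per scraper plus the derived highest value.
const odds = new Map();

// Upsert one scraper's value and report whether highest_odd changed.
// Returns { oddId, eventId, highestOdd } on change, or null otherwise.
function applyOdd({ oddId, eventId, scraper, value }) {
  let row = odds.get(oddId);
  if (!row) {
    // Insert path: first time any scraper reports this odd.
    row = { eventId, values: {}, highestOdd: null };
    odds.set(oddId, row);
  }
  // Update path: each scraper only touches its own column.
  row.values[`odd_value_${scraper}`] = value;

  const highest = Math.max(...Object.values(row.values));
  if (highest === row.highestOdd) return null; // nothing to stream
  row.highestOdd = highest;
  return { oddId, eventId, highestOdd: highest };
}

// Example using the values from the post, plus one update that does
// not change the highest value.
const changes = [
  applyOdd({ oddId: "o1", eventId: "e1", scraper: "scraper1", value: 1.40 }),
  applyOdd({ oddId: "o1", eventId: "e1", scraper: "scraper2", value: 1.50 }),
  applyOdd({ oddId: "o1", eventId: "e1", scraper: "scraper3", value: 1.65 }),
  applyOdd({ oddId: "o1", eventId: "e1", scraper: "scraper1", value: 1.45 }),
];

// Only non-null entries would be pushed to websocket clients.
const toStream = changes.filter((c) => c !== null);
console.log(toStream.map((c) => c.highestOdd));
```

Filtering out unchanged highest values before broadcasting keeps websocket traffic proportional to visible changes rather than to the raw number of scraper updates.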
Submitted February 09, 2019 at 12:46PM by fringilla