4D Lottery Data Collector

DISCLAIMER: NO OFFENSIVE ACTIONS WERE PERFORMED. THE CODE IS PURELY FOR EDUCATIONAL PURPOSES, TO SHOW WHAT PYTHON CAN DO.


Background

This project was inspired by a friend who asked whether a Python script could pull historical 4D lottery numbers from the official website.

It got me thinking: could I apply any of my CTF knowledge to get that data?

Adventure

I did some research and found that a Python library called BeautifulSoup lets us pull specific tags out of a given HTML document.
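As a rough illustration of that idea, here is a minimal sketch that pulls a date and some numbers out of a small HTML snippet. The snippet is made up for this example (it only borrows the "tables-wrap" and "drawDate" class names used by the script further down), and the helper name extract_draw is hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration only; the real page layout may differ.
SAMPLE_HTML = """
<div class="tables-wrap">
  <table>
    <tr><th class="drawDate">Sat, 01 Jan 2022</th></tr>
    <tr><td>1234</td><td>5678</td></tr>
  </table>
</div>
"""


def extract_draw(html):
    """Return (draw date text, list of cell texts) from the wrapper div."""
    page = BeautifulSoup(html, "html.parser")
    # find() returns the first matching tag, find_all() returns every match
    wrap = page.find("div", class_="tables-wrap")
    date = wrap.find("th", class_="drawDate").get_text(strip=True)
    numbers = [td.get_text(strip=True) for td in wrap.find_all("td")]
    return date, numbers


draw_date, numbers = extract_draw(SAMPLE_HTML)
print(draw_date, numbers)
```

Using get_text() reads the text inside a tag directly, which avoids slicing the raw HTML string by hand.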

The lottery page takes its argument as a GET parameter, which the back-end server processes to return the requested data. The value is base64 encoded and, once decoded, is a string of the form DrawNumber=<number>, since each lottery draw has its own draw number.
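The encoding step can be sketched on its own, with no network access. The helper names build_draw_param and read_draw_param are hypothetical; the "DrawNumber=" prefix is the one used by the script further down.

```python
import base64


def build_draw_param(draw_number):
    """Return the base64 text for the string 'DrawNumber=<draw_number>'."""
    raw = "DrawNumber=" + str(draw_number)
    # b64encode works on bytes, so encode first and decode the result to text
    return base64.b64encode(raw.encode("utf-8")).decode("utf-8")


def read_draw_param(param):
    """Decode a parameter back to plain text, to check the round trip."""
    return base64.b64decode(param).decode("utf-8")


param = build_draw_param(4284)
print(param)
print(read_draw_param(param))
```

Decoding a parameter copied from the browser's address bar in the same way is a quick check of what a site is really passing around.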

Scripting and Testing

After watching a video on how BeautifulSoup is used, I coded a sample script to get the latest draw, and it worked! What was left was a historical lookup over the "draw numbers", storing the results in a text file for later analysis. (Soup, anyone?)

Lesson

Even though this is not a CTF challenge or a real-world engagement, web scraping practised in a non-offensive setting could be used offensively if the code were tweaked. I personally think it can be used for information gathering and reconnaissance on any target of interest. It could also be a CTF challenge... Hmm... Inspiration every day!

The Code

NOTE: URL of the lottery site removed to avoid any trouble.

import requests
import base64
from bs4 import BeautifulSoup as soup
import time


#The draw number you want. Typically 4 digits
# EXAMPLE: draw_number = 4284
# Edit: using loop for draw_number to get multiple draws

for draw_number in range(1000,4284):
	time.sleep(1)
	#Build the base64 parameter
	try:
		param_builder = 'DrawNumber=' + str(draw_number)
		encoded_draw_number = base64.b64encode(param_builder.encode('utf-8'))
		get_param = encoded_draw_number.decode('utf-8')

		#Build the url
		url = '[REMOVED URL]' + get_param

		# Generated_list
		number_list = []

		#Invoke the GET request (the timeout stops one stalled request from hanging the whole loop)
		r = requests.get(url, timeout=10)

		#Read the request
		web_reply = r.text

		page_soup = soup(web_reply,"html.parser")
		# First wrapper div holding the result tables (find_all is the current name for findAll)
		base_table = page_soup.find_all("div",{"class":"tables-wrap"})[0]

		# Draw date, read from the header cell's text (not written to the file below)
		draw_date_text = base_table.find_all("th",{"class":"drawDate"})[0].get_text(strip=True)

		starter_table = base_table.find_all("td")

		for nums in starter_table:
			# Keep the last 4 characters of the cell text, i.e. the 4-digit number
			number_list.append(nums.get_text(strip=True)[-4:])

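		# Append this draw's numbers; open the file once per draw, not once per number
		with open('4d_history.txt','a') as file:
			for items in number_list:
				file.write(items + '\n')
	except Exception as exc:
		# Skip draws that fail to download or parse, but report them instead of failing silently
		print('Skipping draw ' + str(draw_number) + ': ' + str(exc))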