Fortunately, there are useful Python libraries that can solve this problem. My favorite is Selenium, which can drive a real web browser (e.g. Firefox or Chrome) and automate user behavior such as clicking a link, filling in a form or pressing a button. Many Selenium tutorials can be found on Guru99. All we have to do is inspect the page (as explained in the previous blog post) to identify the names of the elements of interest, and then tell Selenium the sequence of actions we want it to perform on the page. Our work is made even simpler by browser plugins such as Selenium IDE, which can be integrated into Firefox to record (and later export) all the actions you perform on a web page. This makes it very quick to automate repetitive behavior on the web.
Below is a small example of Selenium's capabilities. In this demo, we simply open a Firefox browser on the Google home page and search for the term "selenium".
    import os
    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
    from selenium.webdriver.common.keys import Keys

    # note: this demo uses the Selenium 3 API; Selenium 4 replaces
    # find_element_by_id(...) with find_element(By.ID, ...)

    # initialize the web browser
    os.environ["PATH"] += ":/PathTo/geckodriver"  # needed for recent Firefox versions
    firefox_capabilities = DesiredCapabilities.FIREFOX
    firefox_capabilities['marionette'] = True
    driver = webdriver.Firefox(capabilities=firefox_capabilities)
    driver.implicitly_wait(10)  # wait up to 10 s when looking up an element

    # go to google.com
    driver.get("http://www.google.com")

    # find the search box and put the "selenium" text in it
    driver.find_element_by_id("lst-ib").send_keys("selenium")

    # wait a little bit for the text to be sent
    time.sleep(1)

    # press the search button (this will move us to the results page)
    driver.find_element_by_id("lst-ib").send_keys(Keys.RETURN)

    # convert the results page to a BeautifulSoup object and print it
    soup = BeautifulSoup(driver.page_source, "lxml")
    print(soup.body.get_text(" "))

    # wait 1 min before closing the window (this is just for the demo)
    time.sleep(60)
    driver.quit()
That's it for the technical side of this blog post; we can now move on to the logic of today's showcase. Again, I am not going to share the specific code I used, in order to spare the server of the classified advertising website I harvested. However, I can describe the logic of the algorithm and the results I obtained. To collect data about the building land market in Belgium, here are the typical actions we want to perform:
1. Search for building land opportunities in a Belgian city (based on a zip code)
    1.1. Find the search form on the page
    1.2. Select "land for building" as the type of good
    1.3. Fill in the zip code in the search box
    1.4. Press the search button
2. Wait for the results page to load
3. Parse the results pages
    3.1. Convert the page to a BeautifulSoup object (as in the previous blog post) and iterate over all the elements of interest we want to gather
    3.2. Search for a "next page" link and click it if it exists
    3.3. Go to step 3.1 and iterate until all results pages have been downloaded
4. Go to step 1 and iterate with another zip code
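The control flow of the steps above can be sketched as follows. This is only an illustration of the two nested loops (zip codes, then results pages), not the code used for this post. The function `harvest`, its three callback arguments and the `FakeSite` class with its made-up data are all hypothetical names introduced here; in a real run, the callbacks would wrap the site-specific Selenium and BeautifulSoup calls.

```python
def harvest(zip_codes, open_results, parse_page, go_to_next_page):
    """Collect posts for every zip code, following "next page" links.

    The three callables are hypothetical hooks (assumptions, not a real API):
      open_results(zip_code): submit the search form and wait for the results
      parse_page():           return the posts found on the current page
      go_to_next_page():      click "next page"; return False if there is none
    """
    posts = []
    for zip_code in zip_codes:            # outer loop: one search per zip code
        open_results(zip_code)
        while True:                       # inner loop: one pass per results page
            for post in parse_page():
                posts.append(dict(post, zip_code=zip_code))
            if not go_to_next_page():     # no "next page" link: search is done
                break
    return posts


class FakeSite:
    """In-memory stand-in for the browser, used only to exercise the loop."""

    def __init__(self, pages_by_zip):
        self.pages_by_zip = pages_by_zip
        self.pages = [[]]
        self.index = 0

    def open_results(self, zip_code):
        # A zip code without any post yields a single empty results page.
        self.pages = self.pages_by_zip.get(zip_code, [[]])
        self.index = 0

    def parse_page(self):
        return self.pages[self.index]

    def go_to_next_page(self):
        if self.index + 1 < len(self.pages):
            self.index += 1
            return True
        return False


# Made-up data: two results pages for one zip code, one page for another.
site = FakeSite({
    "1000": [[{"price_eur": 100000, "surface_m2": 500}],
             [{"price_eur": 80000, "surface_m2": 400}]],
    "4000": [[{"price_eur": 60000, "surface_m2": 1200}]],
})
posts = harvest(["1000", "4000", "9999"],
                site.open_results, site.parse_page, site.go_to_next_page)
print(len(posts), "posts collected")
```

Keeping the browser calls behind callbacks means the loop can be checked without touching the real website, which also fits the goal of sparing its server.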
As you can see, the logic is rather simple and, thanks to Selenium IDE, all these actions can be recorded in a couple of minutes. The only thing left to do is to wrap all this in a Python loop over the zip codes of the cities we want to analyze. I processed all the posts in all Belgian cities during the month of October and collected, for each post, the description of the good, the surface of the land, the selling price, the zip code and the town. It took less than two hours to collect all the data, corresponding to approximately 10,000 posts. You can find below some figures made from these data.
The number of posts collected per Belgian city zone (white indicates that no post was found).
Average surface (in m²) of the building land for sale in each city zone.
Average price (in euro) of the building land for sale in each city zone.
Average price per surface (€/m²) of the building land for sale in each city zone.
The last figure can be compared to the official figure published by the Belgian government in 2014. It can be seen that, although we used data from a single harvest in October 2016, the two figures are very similar and we reproduce all the trends observed in the official figure: high prices in Brussels and on the Belgian coast. The price scale is also quite comparable. We can also notice that the plots for sale are much larger in Wallonia than in Flanders, but their price per m² is also much lower.
Now that we have meaningful data, we can start looking for the best investments. To do so, we look for plots whose price per m² is significantly lower than the average for their town. To cope with the limited statistics we have, we only consider towns with at least 5 offers (in order to have a reasonable error on the average). Below is the list of the 25 best investments you can make relative to the average price in the town. In the top 10, nine goods are located in Flanders, which certainly means something...
- Land of 1614 m² to sell at 215000 € (133.21 €/m²) in 1860 meise (average for the town is 352.58 ± 66.56 €/m²)
- Land of 1445 m² to sell at 125000 € (86.51 €/m²) in 3950 bocholt (average for the town is 152.01 ± 20.83 €/m²)
- Land of 15950 m² to sell at 595000 € (37.30 €/m²) in 2550 kontich (average for the town is 524.20 ± 155.83 €/m²)
- Land of 3326 m² to sell at 144000 € (43.30 €/m²) in 3320 hoegaarden (average for the town is 232.71 ± 61.52 €/m²)
- Land of 2013 m² to sell at 107000 € (53.15 €/m²) in 3560 lummen (average for the town is 173.30 ± 39.51 €/m²)
- Land of 2487 m² to sell at 175000 € (70.37 €/m²) in 3520 zonhoven (average for the town is 179.41 ± 36.29 €/m²)
- Land of 1810 m² to sell at 135000 € (74.59 €/m²) in 3990 peer (average for the town is 177.42 ± 34.83 €/m²)
- Land of 2680 m² to sell at 90000 € (33.58 €/m²) in 3970 bourg-leopold (average for the town is 167.06 ± 46.53 €/m²)
- Land of 1386 m² to sell at 225000 € (162.34 €/m²) in 1860 meise (average for the town is 352.58 ± 66.56 €/m²)
- Land of 3050 m² to sell at 57000 € (18.69 €/m²) in 5350 ohey (average for the town is 45.65 ± 9.70 €/m²)
- Land of 10491 m² to sell at 32000 € (3.05 €/m²) in 6640 vaux-sur-sure (average for the town is 42.92 ± 14.34 €/m²)
- Land of 14201 m² to sell at 90000 € (6.34 €/m²) in 6860 leglise (average for the town is 56.48 ± 18.47 €/m²)
- Land of 7000 m² to sell at 165000 € (23.57 €/m²) in 1370 jodoigne-souveraine (average for the town is 85.28 ± 23.38 €/m²)
- Land of 2156 m² to sell at 312000 € (144.71 €/m²) in 2870 breendonk (average for the town is 269.68 ± 47.35 €/m²)
- Land of 1521 m² to sell at 128200 € (84.29 €/m²) in 3670 meeuwen-gruitrode (average for the town is 228.11 ± 54.78 €/m²)
- Land of 6858 m² to sell at 399000 € (58.18 €/m²) in 2310 rijkevorsel (average for the town is 250.75 ± 74.73 €/m²)
- Land of 6000 m² to sell at 280000 € (46.67 €/m²) in 1570 gammerages (average for the town is 221.27 ± 68.20 €/m²)
- Land of 1858 m² to sell at 275000 € (148.01 €/m²) in 1780 wemmel (average for the town is 423.63 ± 110.28 €/m²)
- Land of 12370 m² to sell at 149000 € (12.05 €/m²) in 6470 sivry-rance (average for the town is 37.70 ± 10.32 €/m²)
- Land of 1371 m² to sell at 125000 € (91.17 €/m²) in 3990 peer (average for the town is 177.42 ± 34.83 €/m²)
- Land of 12000 m² to sell at 100000 € (8.33 €/m²) in 5377 somme-leuze (average for the town is 46.75 ± 15.63 €/m²)
- Land of 13860 m² to sell at 35000 € (2.53 €/m²) in 4190 ferrieres (average for the town is 45.60 ± 17.78 €/m²)
- Land of 770 m² to sell at 30000 € (38.96 €/m²) in 2235 hulshout (average for the town is 227.01 ± 78.15 €/m²)
- Land of 7700 m² to sell at 84000 € (10.91 €/m²) in 6670 gouvy (average for the town is 39.97 ± 12.15 €/m²)
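The selection behind the list above can be sketched as follows. This is a minimal illustration and not the code used for this post: the function name `rank_bargains`, the field names (`town`, `surface_m2`, `price_eur`) and the sample data are hypothetical. It also makes two assumptions the post does not confirm: that the quoted ± is the standard error of the town's mean price per m², and that offers are ranked by how many such errors they sit below that mean. The order of the list above appears consistent with such a ranking.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def rank_bargains(posts, min_offers=5):
    """Rank offers by how far their price per m² sits below the town average.

    Returns tuples (score, town, unit_price, town_average, town_error),
    best bargain first. The score is (average - unit_price) / error, where
    the error is ASSUMED to be the standard error of the mean.
    """
    by_town = defaultdict(list)
    for post in posts:
        if post["surface_m2"] > 0:        # skip posts without a usable surface
            by_town[post["town"]].append(post["price_eur"] / post["surface_m2"])

    ranked = []
    for town, unit_prices in by_town.items():
        # Too few offers give an unreliable average (stdev needs at least 2).
        if len(unit_prices) < max(min_offers, 2):
            continue
        average = mean(unit_prices)
        error = stdev(unit_prices) / sqrt(len(unit_prices))
        if error == 0:                    # all offers identical: nothing to rank
            continue
        for unit_price in unit_prices:
            score = (average - unit_price) / error
            ranked.append((score, town, unit_price, average, error))

    ranked.sort(key=lambda row: row[0], reverse=True)
    return ranked


# Made-up data: town "A" has five offers, town "B" only four (so it is ignored).
sample = (
    [{"town": "A", "surface_m2": 1000, "price_eur": 100000}] * 4
    + [{"town": "A", "surface_m2": 1000, "price_eur": 50000}]
    + [{"town": "B", "surface_m2": 1000, "price_eur": 10000}] * 4
)
ranking = rank_bargains(sample)
best_score, best_town, best_unit_price, town_average, town_error = ranking[0]
print(f"{best_unit_price:.2f} €/m² in {best_town} "
      f"(average for the town is {town_average:.2f} ± {town_error:.2f} €/m²)")
```

Dividing by the error rather than comparing raw differences keeps a noisy town with few, scattered offers from dominating the ranking.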
Have you already faced similar issues? Feel free to contact us, we'd love to talk to you…