Q. Log analysis and cybersecurity threats

A couple of weeks ago I launched my first site, BerryNews.org: an autonomous, self-sufficient, self-generating website built with web scraping, Python, pandas and a few other tweaks in the backend.

I initially spent about 4-5 weeks coding as a pandemic project, so as not to go insane from all the staying indoors. One thing led to another and it snowballed into a full-blown site, so I said to myself: why not buy a domain? I had always wanted my own launch page, to see what it's all about, so I did it.

After almost a day of bashing my head on the table I managed to configure my DNS to point to my IP, from where my router forwards all the traffic to my apache2 web server which, as you might already know, is hosted on my Raspberry Pi.

SUCCESS! Going Public!

My site was on the World Wide Web and, even though it had a horrible (90's) design, at least it was working: every box in the table contained a title, link, photo, date and of course category, all scraped from various news sites from both Romania and the Netherlands. I even had a little developer's page where I would set out my goals and gloat about my latest accomplishments or mope about the bugs I had found and had to fix.

Over the next 2 weeks, right up until the birth of this chapter, I spent many nights improving my design (giving the site a fresh new look), fixing bugs, adding new features and news sources, and optimizing my code with multi-threading and queueing. Still, I always felt like I was missing something. I had a traffic monitor on my to-do list, but I didn't succeed in deploying ApacheGUI (probably because of the arm64 architecture of the Raspberry Pi's CPU). Nevertheless it's a really cool, very well built project, so big thanks to jrossi227.

One morning I woke up and, as I usually do in the mornings, checked my website to see if everything was up and running. (I had it on my list to add an automatic email notification for when something goes bad but never got around to finishing it. Funnily enough, the code to send emails is done and can notify me when everything works fine, but I commented that part out because nobody wants an email every 10 minutes saying that the pages were scraped correctly; I just needed it to do the reverse.) To my surprise the site was not accessible, and neither was the Raspberry Pi. What the f***? Did my SD card get corrupted?

This brings us to the first subchapter:

A. Log analysis

Getting the logs is the easy part. On my version of Apache they are stored in /var/log/apache2, but that might differ and they might be placed in /var/log/httpd/ instead, so your best chance is to run:

$ locate access.log

If we list the files in the log directory we can see the access.log file and its archives (access.log.x.gz), plus error.log (self-explanatory as to what it stores) and its archives (error.log.x.gz). If you would like to know more about archives please read this article.

Example:

ubuntu: CPU Temperature/ :ls -l /var/log/apache2
total 236
-rw-r----- 1 root adm     0 Feb 28 11:29 access.log
-rw-r----- 1 root adm 11421 Feb 26 15:45 access.log.1
-rw-r----- 1 root adm 13555 Feb 18 00:23 access.log.10.gz
-rw-r----- 1 root adm 11900 Feb 17 00:50 access.log.11.gz
-rw-r----- 1 root adm  1893 Feb 16 00:49 access.log.12.gz
-rw-r----- 1 root adm   392 Feb 13 22:37 access.log.13.gz
-rw-r----- 1 root adm   386 Feb 10 18:41 access.log.14.gz
-rw-r----- 1 root adm  5825 Feb 25 23:25 access.log.2.gz
-rw-r----- 1 root adm  2858 Feb 24 18:48 access.log.3.gz
-rw-r----- 1 root adm  9975 Feb 23 23:59 access.log.4.gz
-rw-r----- 1 root adm 23297 Feb 22 23:37 access.log.5.gz
-rw-r----- 1 root adm 10716 Feb 21 23:50 access.log.6.gz
-rw-r----- 1 root adm  8171 Feb 20 23:27 access.log.7.gz
-rw-r----- 1 root adm 39455 Feb 20 00:53 access.log.8.gz
-rw-r----- 1 root adm  6580 Feb 19 00:35 access.log.9.gz
-rw-r----- 1 root adm   587 Mar  1 16:37 error.log
-rw-r----- 1 root adm   704 Mar  1 10:10 error.log.1
-rw-r----- 1 root adm   396 Feb 19 01:00 error.log.10.gz
-rw-r----- 1 root adm   447 Feb 18 01:00 error.log.11.gz
-rw-r----- 1 root adm   453 Feb 17 01:00 error.log.12.gz
-rw-r----- 1 root adm   238 Feb 16 01:00 error.log.13.gz
-rw-r----- 1 root adm   307 Feb 15 16:32 error.log.14.gz
-rw-r----- 1 root adm   375 Feb 28 11:30 error.log.2.gz
-rw-r----- 1 root adm   517 Feb 26 00:00 error.log.3.gz
-rw-r----- 1 root adm   473 Feb 25 10:37 error.log.4.gz
-rw-r----- 1 root adm   479 Feb 24 00:00 error.log.5.gz
-rw-r----- 1 root adm  9576 Feb 23 00:00 error.log.6.gz
-rw-r----- 1 root adm   474 Feb 22 00:00 error.log.7.gz
-rw-r----- 1 root adm   547 Feb 21 00:00 error.log.8.gz
-rw-r----- 1 root adm   734 Feb 20 01:00 error.log.9.gz
-rw-r----- 1 root adm     0 Jan 16 13:20 other_vhosts_access.log

Interpreting logs

To read a .log file we use cat.

The rotated .gz files cannot be read with cat directly. They are gzip-compressed (not tar archives), so they would first have to be decompressed with gunzip, but we are lazy and won't start gunzip-ing all those log files, not to mention that we would have to clean up after ourselves afterwards.

To read a .gz file we use zcat, which decompresses to the terminal for viewing and leaves no breadcrumbs on disk.
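
If you would rather process the logs from Python than from the shell, the same cat/zcat idea can be sketched with the standard gzip module. This is only a minimal sketch: the function names read_log_lines and read_all_access_logs are made up for illustration, and it assumes the rotated files end in .gz as in the listing above.

```python
import gzip
from pathlib import Path
from typing import Iterator


def read_log_lines(path: Path) -> Iterator[str]:
    """Yield lines from a plain or gzip-compressed log file."""
    # Rotated logs end in .gz; gzip.open decompresses on the fly (like zcat)
    # and writes nothing to disk.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, mode="rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


def read_all_access_logs(log_dir: Path) -> Iterator[str]:
    """Yield lines from access.log and every rotated copy found in log_dir."""
    # Files are visited in name order, which is not strictly chronological.
    for path in sorted(log_dir.glob("access.log*")):
        yield from read_log_lines(path)
```

Calling read_all_access_logs(Path("/var/log/apache2")) would then walk every access log in one go. Keep in mind that these files are readable only by root and the adm group, so the script needs the same permissions as the shell commands.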

ubuntu: Log Analyzer/ :cat /var/log/apache2/access.log.1 | grep 192. | grep -v php
192.168.1.21 - - [26/Feb/2021:09:13:56 +0100] "GET / HTTP/1.1" 200 1289 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:09:13:56 +0100] "GET /css/berry.css HTTP/1.1" 200 1051 "http://192.168.1.31/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:09:14:47 +0100] "-" 408 0 "-" "-"
192.168.1.21 - - [26/Feb/2021:10:32:25 +0100] "GET /developer_blog.html? HTTP/1.1" 200 3134 "http://192.168.1.31/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:10:32:26 +0100] "GET /css/developer_blog.css HTTP/1.1" 200 1046 "http://192.168.1.31/developer_blog.html?" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:10:32:26 +0100] "GET /image/linkedin-black-logo.png HTTP/1.1" 200 5072 "http://192.168.1.31/developer_blog.html?" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:10:32:26 +0100] "GET /image/github-black.png HTTP/1.1" 200 7397 "http://192.168.1.31/developer_blog.html?" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:10:32:26 +0100] "GET /BerryNews.png HTTP/1.1" 404 490 "http://192.168.1.31/developer_blog.html?" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"
192.168.1.21 - - [26/Feb/2021:10:32:27 +0100] "GET /romania-ro.html? HTTP/1.1" 200 27948 "http://192.168.1.31/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"

I know logs will hit you like a wall of text, and depending on their verbosity they might also hit you like a ton of bricks, but this is a simple example of normal logging to the access file.

192.168.1.21 - - [26/Feb/2021:10:32:27 +0100] "GET /romania-ro.html? HTTP/1.1" 200 27948 "http://192.168.1.31/" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36"

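192.168.1.21 - the client IP, in this case my own machine on the internal network

- - (the two dashes) - the identity and authenticated user fields, both empty here

[26/Feb/2021:10:32:27 +0100] - the date and time at which my web server received the request, with the timezone offset

"GET /romania-ro.html? HTTP/1.1" - the request line: a GET for the romania-ro.html page over HTTP/1.1

200 - the HTTP status code, for more details check out HTTP Status Responses

27948 - the size of the response sent to the client, in bytes

"http://192.168.1.31/" - the Referer header, meaning the page the request came from (here the home page, reached through the server's internal address)

"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36" - the User-Agent header sent by the browser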

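The field breakdown of a log line can also be done programmatically. Below is a minimal sketch of a parser, assuming the log uses Apache's "combined" format as in the sample line. The names COMBINED and parse_line are hypothetical, and the pattern does not handle escaped quotes inside a request or header.

```python
import re
from typing import Optional

# Apache "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def parse_line(line: str) -> Optional[dict]:
    """Split one access.log line into named fields, or return None."""
    match = COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Apache may log "-" when no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields
```

With the fields separated like this, filtering by status code, client IP or requested path becomes a dictionary lookup instead of a chain of grep calls.
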
This is all fine and dandy, but what happens when you have a ton of logs to go through? Some of the requests are malicious but most are OK, so I don't care about those: should we go through them line by line?

Well, the answer to that is simple and, as I have mentioned many times, I am lazy, so NO. Thus I created a script, log_analyzer.sh, that does it for me. It is worth mentioning that I had to do a bit of manual processing before I figured out which GET is OK and which is not.

The main exploits and brute force attacks this script searches for are listed below; they are exactly the ones I found in my access.log files. I might have had more malicious visits than actual traffic on the site :)).

Next, I will write a short chapter on the types of malicious software and attacks we can stumble across, just as I have.

php | phpmyadmin

'GET /pma/scripts/setup.php HTTP/' - https://www.trustwave.com/en-us/resources/blogs/spiderlabs-blog/honeypot-alert-extensive-setupphp-scanning-detected/
xmlrpc.php - https://the-bilal-rizwan.medium.com/wordpress-xmlrpc-php-common-vulnerabilites-how-to-exploit-them-d8d3c8600b32
wlwmanifest.xml - https://community.cloudflare.com/t/is-there-a-way-to-prevent-wp-path-probing/204761/6
systembc - Ransomware operators use SystemBC RAT as off-the-shelf Tor backdoor - https://news.sophos.com/en-us/2020/12/16/systembc/
solr/admin/info/system?wt=json - https://forum.codeigniter.com/thread-75932.html
boaform - https://www.theregister.com/2020/04/16/fiber_routers_under_fire/
robots.txt - this is not malicious but it gave me a scare - https://developers.google.com/search/reference/robots_txt
'GET /manager/html/ HTTP/1.0' - https://serverfault.com/questions/384836/what-are-these-weird-access-requests
'GET /config/getuser?index=0 HTTP/1.1' - https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-25078
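'GET /_ignition/execute-solution HTTP/1.1' - https://www.rapid7.com/db/modules/exploit/multi/scada/inductive_ignition_rce/ (note: as far as I can tell this path is the debug endpoint of the Ignition error page used by Laravel, targeted via CVE-2021-3129, rather than the Inductive Automation product the linked module covers)
'GET /jenkins/login HTTP/1.0' - https://book.hacktricks.xyz/pentesting/pentesting-web/jenkins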
'GET /setup.cgi?next_file=netgear.cfg' - https://www.exploit-db.com/exploits/43055
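
To give an idea of how such a list can be put to work, here is a minimal sketch that counts how often each marker shows up in a set of request lines. It is not the actual log_analyzer.sh, only an illustration of the same idea. The names SUSPICIOUS_MARKERS, find_marker and count_probes are hypothetical, and the markers are substrings taken from the list above plus a plain "phpmyadmin".

```python
from collections import Counter
from typing import Iterable, Optional

# Substrings from the list above; robots.txt is left out because it is harmless.
SUSPICIOUS_MARKERS = (
    "/pma/scripts/setup.php",
    "xmlrpc.php",
    "wlwmanifest.xml",
    "systembc",
    "solr/admin/info/system",
    "boaform",
    "/manager/html",
    "/config/getuser",
    "/_ignition/execute-solution",
    "/jenkins/login",
    "setup.cgi?next_file=netgear.cfg",
    "phpmyadmin",
)


def find_marker(request: str) -> Optional[str]:
    """Return the first marker contained in the request line, if any."""
    lowered = request.lower()
    for marker in SUSPICIOUS_MARKERS:
        if marker in lowered:
            return marker
    return None


def count_probes(requests: Iterable[str]) -> Counter:
    """Count suspicious requests per marker; ordinary requests are skipped."""
    hits: Counter = Counter()
    for request in requests:
        marker = find_marker(request)
        if marker is not None:
            hits[marker] += 1
    return hits
```

Substring matching is crude on purpose: it will flag a legitimate page whose URL happens to contain one of the markers, so the result is a list of lines worth a second look rather than a verdict.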

B. Web Security Vulnerabilities

I have to say that this sub-chapter is proving to be a bit of a headache. The subject of cyber security (web security, internet security, malware, types of attacks and so forth) is so vast and ever-changing that it's a challenge just to highlight a few cases, because all of them are possible, and even probable, in the ever-vulnerable world of the internet.

As per PentaSecurity: Cyberattacks on web applications are increasingly common. As more and more governments and businesses move their services online, web applications become an easy target for cybercriminals. Web attacks are one of the biggest threats to corporate security and data security.

What is a web application?

A web application is an application program that is installed on a remote server and delivered through the internet, with the website being the user interface. Think about email, social media, and e-commerce sites – you are basically using these applications on the web without having the need to install it locally on your computer.

So in other words, a web application is exactly the type of service my website was providing.

As per Cisco: A cyberattack is a malicious and deliberate attempt by an individual or organization to breach the information system of another individual or organization. Usually, the attacker seeks some type of benefit from disrupting the victim’s network.

Cyber threats nowadays typically consist of the following types:

Advanced Persistent Threats
Phishing
Trojans
Botnets
Ransomware
Distributed Denial of Service (DDoS)
Wiper Attacks
Intellectual Property Theft
Theft of Money
Data Manipulation
Data Destruction
Spyware/Malware
Man in the Middle (MITM)
Drive-By Downloads
Malvertising
Rogue Software
Unpatched Software

Among the attacks aimed specifically at web applications we have:

Cross-site scripting (XSS) attack - which is when a website has a vulnerability that allows the injection of scripts. Attackers exploit such vulnerabilities and inject malicious JavaScript into the website’s database. When a user later requests these data, the user’s web browser would execute the malicious JavaScript. This would allow the attacker to steal the browser’s cookies for session hijacking. Hackers can then use the session information to exploit additional vulnerabilities, possibly gain network information and control the user’s computer. This is especially critical for the corporate environment as one XSS attack could compromise the whole network.

But this was not the case, as my site had no JavaScript or PHP.

OS command injection attack - An OS command injection is when attackers input operating system (OS) commands into the server that is running the web application. It differs from an SQL injection because it enters from the server-side instead of the application-side. However, the consequences are very similar to an SQL injection attack, where attackers can take full control of the application. Attackers can command the application to display sensitive information, as well as modifying and deleting data. The application can also be utilized to compromise other parts of the corporate network, leading to further attacks within the organization.

This was not the case either, as I was only displaying stuff, like a poster on a wall.

LDAP injection attack - Lightweight Directory Access Protocol (LDAP) is a software protocol mostly used for corporate intranets. It enables anyone on the network to find resources from its directory, such as other individuals, devices, files, as well as usernames and passwords as part of a single sign-on (SSO) system. An LDAP injection attack is when a vulnerability allows attackers to send queries without proper validation. Attackers could then alter the queries to gain access to critical resources, leading to devastating consequences.

This again was not the case, as there was no link between my database (which is mainly unused) and the website.

Brute force attack - A brute force attack, sometimes called a password attack, is one of the simplest forms of web attacks. The hacker simply tries different combinations of usernames and passwords repeatedly until one of them logs into the user’s account. Take a standard eight-character password, for example: 52 letters (uppercase and lowercase) and 10 digits provide 62 possible characters, making a total of 62^8 = 2.1834011×10^14 possible combinations. Of course, it would take years for a single computer to try all the combinations. But when hackers gain control of multiple computers or develop a powerful software-based computing engine, things can become very easy.

This was indeed the case, as you can see in the log excerpt below.
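
The arithmetic behind that number is easy to check. The sketch below counts the passwords that can be built from 62 characters at length eight and estimates how long an exhaustive search would take. The rate of 1000 guesses per second is purely an assumption for illustration, and the helper name years_to_exhaust is hypothetical.

```python
ALPHABET_SIZE = 26 + 26 + 10  # lowercase + uppercase + digits = 62
PASSWORD_LENGTH = 8

# Every position can hold any of the 62 characters independently.
combinations = ALPHABET_SIZE ** PASSWORD_LENGTH


def years_to_exhaust(total: int, guesses_per_second: float) -> float:
    """Years needed to try every combination at a constant guessing rate."""
    seconds_per_year = 365 * 24 * 60 * 60
    return total / guesses_per_second / seconds_per_year


print(f"{combinations:,} combinations")
print(f"{years_to_exhaust(combinations, 1000):,.0f} years at an assumed 1000 guesses/s")
```

This is also why attackers rarely search the whole space: trying a short list of very common credentials first is far cheaper than exhausting every combination.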

I removed the user agent ' "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36" ' from the end of each line because it adds nothing to my point.

208.168.239.183 - - [23/Feb/2021:04:47:23 +0100] "GET /phpmyadmin/ HTTP/1.1" 200 16536 
208.168.239.183 - - [23/Feb/2021:04:47:24 +0100] "GET /phpmyadmin/index.php?pma_username=popa3d&pma_password= HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:25 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=bitnami HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:26 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=root HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:27 +0100] "GET /phpmyadmin/index.php?pma_username=test&pma_password=test HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:27 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=mysql HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:28 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password= HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:29 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=123456 HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:30 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=123 HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:31 +0100] "GET /phpmyadmin/index.php?pma_username=admin&pma_password=admin HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:31 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=admin HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:32 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=password HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:33 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=toor HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:34 +0100] "GET /phpmyadmin/index.php?pma_username=wordpress&pma_password=wordpress HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:34 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=1234 HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:35 +0100] "GET /phpmyadmin/index.php?pma_username=joomla&pma_password=joomla HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:36 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=0 HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:37 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=12345 HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:37 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=test HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:38 +0100] "GET /phpmyadmin/index.php?pma_username=user&pma_password=user HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:39 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=letmein HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:40 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=root123 HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:40 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=dbadmin HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:41 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=r00t HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:42 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=qwerty HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:43 +0100] "GET /phpmyadmin/index.php?pma_username=popa3d&pma_password=popa3d HTTP/1.1" 200 16535 "-" 
208.168.239.183 - - [23/Feb/2021:04:47:44 +0100] "GET /phpmyadmin/index.php?pma_username=root&pma_password=123456789 HTTP/1.1" 200 16535 "-"
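
The pattern in the excerpt above (one IP, one login page, a new username and password pair every second) can be pulled out of the logs automatically. Here is a minimal sketch that collects the credentials each client tried, assuming you already have (client IP, request line) pairs. The function name credential_attempts is hypothetical, and the pma_username and pma_password parameter names are the ones visible in the log lines.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple
from urllib.parse import parse_qs, urlsplit


def credential_attempts(
    entries: Iterable[Tuple[str, str]],
) -> Dict[str, List[Tuple[str, str]]]:
    """Map each client IP to the (username, password) pairs it tried.

    Each entry is (host, request), where request looks like
    'GET /path?query HTTP/1.1'.
    """
    attempts: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for host, request in entries:
        parts = request.split()
        if len(parts) < 2:
            continue  # e.g. the "-" request logged for a 408 timeout
        # keep_blank_values keeps attempts with an empty password.
        query = parse_qs(urlsplit(parts[1]).query, keep_blank_values=True)
        if "pma_username" in query:
            username = query["pma_username"][0]
            password = query.get("pma_password", [""])[0]
            attempts[host].append((username, password))
    return dict(attempts)
```

A client with dozens of distinct pairs in its list is almost certainly running a dictionary of default credentials, which is exactly what the excerpt shows.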

There were many other IPs like this one. You might think they are foolish and naive to try such things but, funnily enough, they are successful in 34% of the cases. Unbelievable but true!

There was one guy who literally tried breaking into my web server with his username and no password :)). He sent something like 56 requests in less than 20 seconds, which brings me to the last type of attack:

Denial-of-service (DoS)/ distributed-denial-of-service (DDoS) attack

A denial-of-service attack is when an attacker sends an enormous amount of traffic to a website in an attempt to overwhelm the hosting server to disrupt and even paralyze service. What’s more, for websites renting cloud servers with volume-based costing, they could be charged with an astronomical cost by the service provider. A distributed-denial-of-service is the same concept, except that this time, the hacker gains illegal control over a number of computers to launch the attack on a larger scale.
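
A single noisy client is not a real DoS, but the same symptom (too many requests in too short a time) is something you can look for in the access log. Below is a minimal sliding-window sketch, assuming (client IP, timestamp) pairs in the Apache time format and in chronological order per client. The function name find_bursts and the thresholds of 50 requests per 20 seconds are assumptions chosen for illustration, not recommended values.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Iterable, Set, Tuple

# Matches timestamps such as "23/Feb/2021:04:47:23 +0100".
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def find_bursts(
    entries: Iterable[Tuple[str, str]],
    max_requests: int = 50,
    window_seconds: int = 20,
) -> Set[str]:
    """Return the client IPs that exceed max_requests within the window."""
    window = timedelta(seconds=window_seconds)
    recent: Dict[str, Deque[datetime]] = defaultdict(deque)
    flagged: Set[str] = set()
    for host, raw_time in entries:
        stamp = datetime.strptime(raw_time, TIME_FORMAT)
        times = recent[host]
        times.append(stamp)
        # Drop requests that have fallen out of the sliding window.
        while stamp - times[0] > window:
            times.popleft()
        if len(times) > max_requests:
            flagged.add(host)
    return flagged
```

Flagged addresses could then be fed to a firewall rule or a tool such as fail2ban. For a distributed attack this per-IP view is not enough, since the load is spread across many clients that each stay under the threshold.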

I'm going to stop this chapter here, as it is quite lengthy. Long story short: no damage was caused and I was not hacked, but I decided to keep a low profile until I start reinforcing and upgrading my security. My website was just presenting stuff, like a wall holding a poster, so all the attacks hit the wall and nothing was stolen or infected. Also, I have no PHP on my page, nor is my website linked to my MySQL database. Then again, these attacks were not targeted: people scan the internet looking for vulnerable targets such as my website.

                                            **Congrats, you're done!**

Conclusion

We have learned how to check logs and interpret the access.log file. We have also learned about cybersecurity and why it is important. Finally, we went over how the internet is full of threats, along with my analysis of my first-hand encounter with them. Please don't forget about the log analyzer; its structure can definitely come in handy when dealing with such a situation.

If you hit a problem or have feedback (which is highly welcomed) please feel free to get in touch, more details in the footer.

Contact:

🔗🌳 All-in-One

💼 BT LinkedIn

📩 Bogdan Tudorache
