11/16/2023

Grep IP address from file

With "head -n 5" you would get, for instance, only the first five lines.

If you need a different number, just pass it with the "-n" command-line switch.

The tool “head” works like “tail”, but it gives you the first lines of some data instead of the last ones.

And just like tail, head gives you exactly ten lines of output by default.

Just give sorted data to "uniq -c", and uniq will give you every unique line it sees together with the number of occurrences:

cut -d " " -f 1 demolog | sort | uniq -c | sort -nr | head

The third thing “uniq” can help you with is the one we need here: counting the duplicates.

With “uniq” you can remove duplicates from text-based data, or you can just identify (show) the duplicates without deleting them.

If we wanna count the repeated IP addresses, we now simply need to count the duplicates of every single line we have.

Every time you need to deal with duplicates at the Linux shell, the tool “uniq” is the way to go.

(Do you remember that we wanna create a top 10 list of the IP addresses?)

Count duplicate lines with uniq

The next step now is to count the lines of all the repeated IP addresses.

To do this, we just feed the output of the cut command through “sort”:

As you see from the excerpt above, some IP addresses are responsible for a single request only, while others have hit the website multiple times.

For our task it’s quite enough to sort the addresses in a simple alphabetical way.

Sort lets you sort existing data in different ways, for instance alphabetically or numerically.

Now that we have extracted all the IP addresses, we need to put them in order to process them further.

And if you need to “put something in order”, that is, if you need to sort something at the Linux command line, then the tool “sort” is the way to go.

(and isn’t simplicity always king?)
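The cut, sort, uniq and head steps described above can be tried end to end with a small sketch like the following. The function names `top_ips` and `sample_log` and the sample log lines are made up for illustration; they are not part of the original article, and the sketch assumes the IP address is the first space-separated field of every line.

```shell
# top_ips: hypothetical helper (an assumption, not from the article).
# Reads log lines on stdin and prints the N most frequent first fields.
top_ips() {
    n="${1:-10}"            # default to a top 10 list
    cut -d " " -f 1 |       # keep only the first space-separated field (the IP)
        sort |              # put identical addresses next to each other
        uniq -c |           # collapse each run and prefix it with its count
        sort -nr |          # order by count, highest first
        head -n "$n"        # keep only the first N lines
}

# sample_log: made-up data in the "IP-ADDRESS - REQUEST" shape.
sample_log() {
    printf '%s\n' \
        '10.0.0.1 - GET /index.html' \
        '10.0.0.2 - GET /about.html' \
        '10.0.0.1 - GET /contact.html' \
        '10.0.0.3 - GET /index.html' \
        '10.0.0.2 - GET /index.html' \
        '10.0.0.1 - GET /blog.html'
}

sample_log | top_ips 2
```

With this sample data, 10.0.0.1 is listed first with a count of 3, followed by 10.0.0.2 with a count of 2. The amount of whitespace that "uniq -c" puts in front of the count differs between implementations.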
Put the lines in order with sort

Side note: Although extracting data with grep is far more powerful than extracting data fields with cut, I’ll stick with “cut” for the rest of this article.

So we can simply call “cut” in the following way to extract all the IP addresses from a log file called “demolog”:

As you see, as soon as you can describe the data you are interested in as a regular expression, you can use “grep” to extract only the data of interest.

* And this first field is separated by a single space from the rest of the line.
* We are interested in the first field of every line.

This works perfectly in our example here:

This tool is a command you can use if you want to extract fields from lines of text, provided these fields are separated from each other by a dedicated single character.

The first approach is to use the command-line tool “cut” for extracting the IP addresses.

Where the first part of every line shows the IP address where the request came from.

And our goal now is to take the whole log file and generate the top 10 IP addresses that sent the most requests to my web server.

And if we wanna have a top 10 list of the IP addresses, we first need to extract them from the log.

For this step I wanna show you two different approaches:

Extract the IP addresses with cut

Or

$ grep -E "$(cidr2regex 192.168.0.0/18)" access_log

Bonus points if your answer also covers IPv6.

IP-ADDRESS - REQUEST & REQUEST-INFORMATION

Ideally, what I'd like is something like grep "" access_log

A tool that converts a CIDR range into the appropriate regex would also be OK.

Is there an easy way to select lines from a file that match any CIDR range?

Fancy regex extensions will be considered, as will different tools (such as awk or perl if necessary, but I want it to be a one-liner) if they make the job easier.

It's easy enough to add the optional zeroes to the regex, but it just makes the whole thing a little bit more difficult.

Printers in particular seem to like the leading zeroes.
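As an illustration of what "adding the optional zeroes" might look like, here is a minimal sketch for one natural-boundary range, 192.168.1.0/24. The function name `match_192_168_1` is hypothetical, the sketch assumes the address is the first field of each line, and the last octet is only checked for being one to three digits, not for being at most 255.

```shell
# match_192_168_1: hypothetical helper (an assumption, not from the question).
# Prints lines whose first field lies in 192.168.1.0/24, also accepting
# zero-padded octets such as 192.168.001.001.
match_192_168_1() {
    # 0{0,2}1   : the third octet "1", optionally padded to "01" or "001"
    # [0-9]{1,3}: any one to three digit last octet (not range-checked)
    # ( |$)     : the address must end at a space or at end of line
    grep -E '^192\.168\.0{0,2}1\.[0-9]{1,3}( |$)'
}

printf '%s\n' \
    '192.168.1.7 - GET /' \
    '192.168.001.001 - GET /' \
    '192.168.10.1 - GET /' |
    match_192_168_1
```

The first two sample lines are printed and the 192.168.10.1 line is not. The pattern grows quickly once the range no longer falls on an octet boundary, which is the difficulty the question describes.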
These regexes ignore IP addresses that include leading zeros, such as 192.168.001.001, which isn't a problem in Apache log files but could be in other log files.

This is easy for ranges that fall on the natural boundaries (/8, /16 and /24) but not so easy for other ranges such as /17 and /25.

From time to time I want to grep CIDR ranges out of my Apache log files.
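One possible way around building a regex at all is to compare addresses numerically. The sketch below uses awk for that; the function name `cidr_grep` is hypothetical, and it assumes IPv4 only, with the address as the first space-separated field of each line. It does not validate that each octet is in the range 0 to 255, and it does not cover IPv6.

```shell
# cidr_grep: hypothetical helper (an assumption, not an existing tool).
# Usage: cidr_grep 192.168.0.0/18 < access_log
cidr_grep() {
    awk -v cidr="$1" '
        # Convert a dotted quad to one integer; returns -1 if it does not
        # have exactly four dot-separated parts. Octets are not range-checked.
        function ip2int(ip,    o, n) {
            n = split(ip, o, ".")
            if (n != 4) return -1
            return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
        }
        BEGIN {
            split(cidr, part, "/")
            size = 2 ^ (32 - part[2])            # addresses in the block
            base = int(ip2int(part[1]) / size)   # block number of the range
        }
        {
            addr = ip2int($1)
            # A line matches when its address falls into the same block.
            if (addr >= 0 && int(addr / size) == base) print
        }
    '
}

printf '%s\n' \
    '192.168.63.255 - GET /' \
    '192.168.64.0 - GET /' \
    '192.168.001.001 - GET /' |
    cidr_grep 192.168.0.0/18
```

With this sample input, the 192.168.63.255 and 192.168.001.001 lines are printed, while 192.168.64.0 falls just outside the /18. Because awk converts each octet to a number, zero-padded addresses are handled without extra work.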