
Significance of scraped IP addresses


1 reply to this topic

#1 KSGuy

KSGuy

  • Members
  • 1 posts

Posted 08 December 2018 - 01:47 PM

This is probably way out of the norm, and I apologize if it is not appropriate for this forum.  I need some help from someone who understands internal/external IP addresses and scraping, and that is not me.

 

I mostly do data analytics and data modeling work. A good portion of what I do involves building classification models that predict which category to place data into, based on the known categorization of other data.  I was recently handed a data set and asked to just look at it and figure out what I can, with no known data sets to use as comps.  I have no concept of "normal" for anything I am looking at.  Two variables in this set are the IP address and web browser scraped from the submitted online application and/or registration.  In this particular case, we are looking for potentially fraudulent submissions: multiple applications from the same person, identity theft, or both.  Most of the IPs appear only once. Several appear two or three times, and some repeat into the hundreds.  Since neither of us has any real-world networking experience, we have a difference of opinion on the relevance of this.

 

Two schools of thought:

  • The duplicates in the hundreds are simply the external IP of the same internet provider, neighborhood, or something similar, which is subnetted to individual users behind it. They are not a great concern, and we should be more concerned with IPs that appear 3, 4, or 5 times.
  • The duplicates in the hundreds are likely to be someone who has purchased stolen identities and is committing large-scale fraud.

Yes, we could do some correlating and make an educated guess, but with zero real-world knowledge and the unknown validity of the other data under either scenario, it seems better not to waste a lot of time making assumptions if I can ask someone with actual knowledge.




 


#2 Kilroy

Kilroy

  • BC Advisor
  • 3,476 posts
  • Gender:Male
  • Location:Launderdale, MN

Posted Yesterday, 03:32 PM

Both schools of thought can be accurate.  For instance, a business may have only one external IP address but hundreds of computers behind it.

 

Internal (private) IP addresses are non-routable: if traffic addressed to them gets out on the Internet, it can't go anywhere.  192.168.x.x and 10.x.x.x are the most common ranges.  A public IP address is routable on the Internet.  A person can find their public IP address by going to What is my IP.com.  This is the IP that is reported to the sites you visit, so it is the IP in your data.  Multiple private (internal) IP addresses can share one public IP address, which is the cause of your problem.
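As a quick sanity check on the data, something like the following could confirm that the scraped addresses really are public ones. This is only a minimal sketch using Python's standard `ipaddress` module; the function name `classify_ip` is made up for illustration, and note that `is_private` also covers loopback and other reserved ranges, not just 192.168.x.x and 10.x.x.x.

```python
import ipaddress

def classify_ip(addr: str) -> str:
    """Return 'private', 'public', or 'invalid' for an IP address string.

    Hypothetical helper for illustration only.
    """
    try:
        ip = ipaddress.ip_address(addr.strip())
    except ValueError:
        # Not parseable as an IPv4 or IPv6 address
        return "invalid"
    # is_private is True for 10.x.x.x, 172.16-31.x.x, 192.168.x.x,
    # and also for loopback and other reserved ranges
    return "private" if ip.is_private else "public"

for sample in ["192.168.1.10", "10.0.0.5", "8.8.8.8", "not-an-ip"]:
    print(sample, classify_ip(sample))
```

If any private addresses show up in the set, they were probably captured incorrectly, since a web site normally only sees the public address.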

 

One would think that an IP address with hundreds of repeats, all with the same browser (especially if the browser version was captured too), would be fraud.  That would most likely be the way to go if you only have the two pieces of data to work from.
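A rough way to apply that idea is to count submissions per IP and see how many different browser strings each IP shows. The sketch below assumes the data can be read as (ip, browser) pairs; the function names and the thresholds (`min_count=100`, `max_browsers=1`) are placeholders I picked for illustration, not known-good cutoffs.

```python
from collections import defaultdict

def summarize_by_ip(records):
    """Count submissions and collect distinct browser strings per IP.

    records: iterable of (ip, browser) pairs (assumed layout).
    """
    summary = defaultdict(lambda: {"count": 0, "browsers": set()})
    for ip, browser in records:
        summary[ip]["count"] += 1
        summary[ip]["browsers"].add(browser)
    return summary

def suspicious_ips(records, min_count=100, max_browsers=1):
    """Return IPs with many submissions but very few distinct browsers.

    Both thresholds are illustrative assumptions to be tuned on real data.
    """
    flagged = []
    for ip, info in summarize_by_ip(records).items():
        if info["count"] >= min_count and len(info["browsers"]) <= max_browsers:
            flagged.append(ip)
    return sorted(flagged)

# Made-up example data: one IP with a single browser, one with many browsers
records = (
    [("198.51.100.7", "Chrome/70")] * 150
    + [("8.8.8.8", f"Browser/{i}") for i in range(150)]
    + [("1.2.3.4", "Firefox/63")] * 3
)
print(suspicious_ips(records))
```

An IP with hundreds of submissions spread across many different browsers looks more like a shared connection (a business or provider), while hundreds from a single browser string is the pattern worth a closer look. It is still only a starting point, not proof either way.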





