Post new topic Reply to topic  [ 20 posts ]  Go to page Previous  1, 2
Author Message
 Post subject: Re: Scraping property pin posts
PostPosted: Sat Dec 06, 2014 9:34 am 
Online
Too Big to Fail

Joined: Sep 13, 2012
Posts: 4706
Sometimes I think OpenWindow is a poetry bot created as part of a psychosocial experiment.

http://botpoet.com/

_________________
"It's easy to confuse what is with what ought to be, especially when what is has worked out in your favour"
Tyrion Lannister


 Post subject: Re: Scraping property pin posts
PostPosted: Wed Dec 17, 2014 9:03 am 
Offline
Too Big to Fail

Joined: Aug 22, 2008
Posts: 3548
Mantissa wrote:
Presumably you have a research proposal. Why don't you post it? It's not like it's going to retrospectively change what people post.

Also note that the Piston area exists in a kind of limbo - it's purposefully not scrapable by Google etc. Of course you could register an account to scrape it, but that would seem like bad form to me.

What's your rationale for stating this? It is still a public forum; you only need a registered login.

As an example, take one very good reason for scraping: analysing a user's posts in order to put together a case to prosecute 'hate speech' (which would require access to the Piston).
From section 2 of the Prohibition of Incitement To Hatred Act, 1989:

2.—(1) It shall be an offence for a person—
(a) to publish or distribute written material,
(b) to use words, behave or display written material—
(i) in any place other than inside a private residence, or
(ii) inside a private residence so that the words, behaviour or material are heard or seen by persons outside the residence...
or
(c) to distribute, show or play a recording of visual images or sounds,
if the written material, words, behaviour, visual images or sounds, as the case may be, are threatening, abusive or insulting and are intended or, having regard to all the circumstances, are likely to stir up hatred.

Now, the law exists for a reason. I think it is extremely important that it applies online just as much as it applies in the physical domain, perhaps even more so in view of certain tendencies stoked in the online domain. The fact that the Piston is only accessible through a login should not preclude it from being subject to normal law, perhaps by analogy with the above: "the words, behaviour or material are heard or seen by persons outside the residence...".

And there may be a multitude of other good and proper reasons for scraping a forum. The only concern I can think of is that the process might cause undue server load, enough to affect the normal operation of the site. But a safeguard against that could easily be built into the software.
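
To make the server load point concrete, here is a minimal sketch of how a scraper might space out its requests. It is not taken from any scraper mentioned in this thread: the names Throttle and fetch_politely, and the choice of a fixed minimum interval, are my own assumptions for illustration. The fetch argument stands in for whatever function actually downloads a page.

```python
import time
from typing import Callable, Iterable, Iterator, Optional


class Throttle:
    """Enforce a minimum interval between successive calls to wait().

    Hypothetical helper: the clock and sleep functions are injectable
    so the behaviour can be checked without real delays.
    """

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request

    def wait(self) -> float:
        """Sleep just long enough to honour min_interval; return time slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    throttle: Throttle,
) -> Iterator[str]:
    """Yield fetch(url) for each URL, pausing between requests."""
    for url in urls:
        throttle.wait()
        yield fetch(url)
```

With something like Throttle(5.0) the scraper would make at most one request every five seconds, which should be negligible next to ordinary browsing traffic. What interval is actually reasonable depends on the site, so treat the number as a placeholder.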


 Post subject: Re: Scraping property pin posts
PostPosted: Wed Dec 17, 2014 10:41 am 
Offline
Nationalised
User avatar

Joined: Jan 4, 2013
Posts: 16983
Location: To the right of the decimal place
WTF? If the Gardai want to access the Piston to investigate whether a crime has been committed I would say fire away. But it's not the job of a private individual to scrape random internet forums in the hopes of uncovering evidence of a crime. Or do you think the Gardai should suck the entire Internet into a database to try and proactively uncover a crime in progress?

Anyway, that's tangential to the question. My issue is that the Piston is meant as a forum where bona fide Pin members can have a somewhat more private (or less public) discussion; hence it requires registration to access, and is not Google-scrapable. I think it would be bad form to access the Piston for a reason contrary to its intended purpose.

_________________
“If you're afraid - don't do it. If you're doing it - don't be afraid!” ― Genghis Khan

"Do, or do not; there is no try" -- Yoda


 Post subject: Re: Scraping property pin posts
PostPosted: Wed Dec 17, 2014 11:33 am 
Offline
Too Big to Fail

Joined: Aug 22, 2008
Posts: 3548
Mantissa wrote:
WTF? If the Gardai want to access the Piston to investigate whether a crime has been committed I would say fire away. But it's not the job of a private individual to scrape random internet forums in the hopes of uncovering evidence of a crime. Or do you think the Gardai should suck the entire Internet into a database to try and proactively uncover a crime in progress?

Anyway, that's tangential to the question. My issue is that the Piston is meant as a forum where bona fide Pin members can have a somewhat more private (or less public) discussion; hence it requires registration to access, and is not Google-scrapable. I think it would be bad form to access the Piston for a reason contrary to its intended purpose.

Take the scraper on GitHub above, which is focused on user names. It takes quite a different approach from a 'big data' type application with the capability to scrape "random" forums in whole, where you would of course need to derive meaning and context intelligently from extremely unstructured data.

In my example, what we're talking about is, say, a situation where someone on a forum comes across hate speech, decides it is their duty to report it, and uses one of these tools to help compile some evidence for their allegation. Exactly as it would happen in the real world.

We are not talking about big brother style surveillance or pre-emptive policing. (Anyway, that is done much more 'sensibly', technically speaking, directly through your Windows PC, Gmail accounts, iPhones etc., if you read the NSA internal documents: U.S. software deployed as espionage infrastructure and so on.)

And the Piston is intended as a forum for off-topic subjects, unrelated to property. That is its 'intended purpose'. It is wholly public. If you want a more private conversation, you PM. There is no stipulation that you must be a 'bona fide' member (whatever the hell that is). You just click the registration links and log in. (Or you are invited by an administrator to a closed or hidden group if the talk turns to sedition or similar on here and you are deemed 'trustworthy'. But someone given access to such a group could still scrape it if they decided to.)


 Post subject: Re: Scraping property pin posts
PostPosted: Wed Dec 17, 2014 1:02 pm 
Offline
Nationalised
User avatar

Joined: Jan 1, 1970
Posts: 22463
Eschatologist wrote:
Sometimes I think OpenWindow is a poetry bot created as part of a psychosocial experiment.

http://botpoet.com/


I enjoyed that game but find the concept only useful in proving its own futility.

_________________
Follow The Pin - https://twitter.com/dailypinster

"Politicians are always realistically maneuvering for the next election. They are obsolete as fundamental problem-solvers." - Buckminster Fuller

"I was comfortable with a couple of banks being married today, instead i wake up and find I'm married to the banks." - Catbear



Pyramid Built, Is Better Built! - Latest Property Discussions www.thepropertypin.com