SNI support using Python requests for .url #988

Closed

@anarcat anarcat commented Jan 5, 2016

without this, SNI-enabled sites, which are becoming more and more
popular, are not displayed by the URL plugin

a good site to test with is: https://sni.velox.ch/

the requests API is similar enough to the `web.get` API to replace it,
but that is left to another pull request, as other plugins may not
require SNI support because they probably don't encounter the same
variety of sites as `.url`

anarcat commented Jan 5, 2016

a sample run: peoplesmic is using web.get, wildcat is using requests.

18:10:05 <@anarcat> https://sni.velox.ch/
18:10:07 <wildcat> [ TLS SNI Test Site: *.sni.velox.ch ] - sni.velox.ch
18:10:13 <@anarcat> https://koumbit.org/
18:10:14 <wildcat> [ Koumbit.org | Pour un internet libre et solidaire ] - koumbit.org
18:10:17 <@anarcat> https://anarc.at/
18:10:17 <peoplesmic> [ À propos de moi ] - anarc.at
18:10:18 <wildcat> [ À propos de moi ] - anarc.at
18:10:19 <peoplesmic> [ Koumbit.org | Pour un internet libre et solidaire ] - koumbit.org
18:10:22 <@anarcat> https://groente.puscii.nl/test.html
18:10:23 <wildcat> [ '); drop tables;-- ] - groente.puscii.nl

notice how the older code is a solid (and consistent, during tests) 5 seconds slower than the requests equivalent, for https://koumbit.org. the discrepancy is different for https://anarc.at/ because it's actually on the local network.

this is because the closing() structure doesn't seem to be supported
in all cases. at least in requests 2.8, the response.close() call
actually works, so we'll use that.

note that it fails in 2.4.3 (debian jessie/stable)

maxpowa commented Jan 5, 2016

What about editing web.py to use requests instead of replacing web.py? Does this use the CA certs defined in the config file or system CA certs?

Edit: This also appears to remove the functionality that limited the amount of bytes to download, so does this now download as much as the page serves? This could be very problematic, especially when linking large files/pages.

Edit2: never mind my edit. What if the url is not text?

anarcat commented Jan 5, 2016

  1. i thought i had less of a chance to make web.py use requests than to change just this one plugin, but i'd be happy to oblige
  2. i believe it would use the system CA certs, but that can be changed with verify=ca_certs i think. maybe an oversight on my part...
  3. i assumed the URL is text, and i believe it is an assumption shared with the current code

anarcat commented Jan 5, 2016

as for the max_bytes part, i believe this is actually faster than the previous version, because it will stop after a few lines instead of reading the whole 640kb of the page...

anarcat commented Jan 5, 2016

re 2. i don't quite understand why the IRC bot needs to define his own CA list. why not use the system ones?

... yet if that would be necessary, i believe the following patch would be sufficient:

diff --git a/sopel/modules/url.py b/sopel/modules/url.py
index 0d7aac5..41fd1c2 100644
--- a/sopel/modules/url.py
+++ b/sopel/modules/url.py
@@ -181,7 +181,7 @@ def check_callbacks(bot, trigger, url, run=True):

 def find_title(url):
     """Return the title for the given URL."""
-    response = requests.get(url, stream=True)
+    response = requests.get(url, stream=True, verify=ca_certs)
     try:
         content = ''
         for line in response.iter_lines(decode_unicode=True):

as an aside, ca_certs seems like a weird global to me... it's not in any imports in web.py?
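
A minimal sketch of how the value passed as `verify` could be chosen. The helper name `tls_verify_option` and its parameters are hypothetical and not part of Sopel or requests; the only requests behaviour assumed is that `verify` accepts either a boolean or a path to a CA bundle.

```python
def tls_verify_option(verify_ssl=True, ca_certs=None):
    """Pick a value for the ``verify`` argument of ``requests.get``.

    Hypothetical helper for illustration; the parameter names are assumptions.
    """
    if not verify_ssl:
        # Certificate verification explicitly disabled.
        return False
    if ca_certs:
        # Path to a CA bundle file, e.g. taken from the bot's configuration.
        return ca_certs
    # Let requests fall back to its default CA bundle.
    return True


# Intended use (not executed here, since it needs the network):
#   response = requests.get(url, stream=True,
#                           verify=tls_verify_option(ca_certs=ca_certs))
```

With no `ca_certs` configured, this leaves the choice of CA bundle to requests, which is the behaviour discussed above.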

embolalia commented

I'll take a more detailed look at the PR later, but I just wanted to answer a few points real quick:

  • Rather than using requests in sopel.web, we're going to deprecate it and suggest using requests instead. The module is kind of messy anyway, and there's no need to have a wrapper around an already pretty good API. (This has already been talked about here and there, and is a longer term plan, but just hasn't been acted on because it's somewhat low priority.)
  • The ca_certs config setting is there because there's no reliable standardization on the location. It gets set somewhere in the bot's startup (when it's reading the config). Yes, this is hideous. But at least at the time, there was no cross-platform solution that didn't involve a hard dependency for core functionality (which we've been trying to avoid).
  • The current code does not assume the content is text for size limiting. It reads however many bytes (not characters), stops, and then assumes the bytes are text. If I'm reading it correctly, what your code is doing will require there to be an occasional 0x0A byte, or it will never hit the limit. What if it's running on a Raspberry Pi, and someone posts a link to a 2GB video file?

anarcat commented Jan 6, 2016

On 2016-01-06 08:08:52, Elsie Powell wrote:

I'll take a more detailed look at the PR later, but I just wanted to answer a few points real quick:

  • Rather than using requests in sopel.web, we're going to deprecate
    it and suggest using requests instead. The module is kind of messy
    anyway, and there's no need to have a wrapper around an already pretty
    good API. (This has already been talked about here and there, and is a
    longer term plan, but just hasn't been acted on because it's somewhat
    low priority.)

Cool. This is what I was hoping would happen. I was also hoping this
change would show a good way forward on how to proceed with this.

In my opinion, there's no good reason why an IRC bot should have its own
HTTP library when requests is around in Python. :) At least now that
requests has significantly stabilised.

  • The ca_certs config setting is there because there's no reliable
    standardization on the location. It gets set somewhere in the bot's
    startup (when it's reading the config). Yes, this is hideous. But at
    least at the time, there was no cross-platform solution that didn't
    involve a hard dependency for core functionality (which we've been
    trying to avoid).

requests had exactly the same problem. they made the certifi package for
that exact purpose, after ripping it out of requests:

http://docs.python-requests.org/en/latest/user/advanced/#ca-certificates

basically, with requests, this is abstracted away to the distribution
(debian package of requests in my case or pypi dependencies in the
general case).

so basically, the deprecation of sopel.web would also deprecate
ca_certs.

  • The current code does not assume the content is text for size
    limiting. It reads however many bytes (not characters), stops, and
    then assumes the bytes are text. If I'm reading it correctly, what
    your code is doing will require there to be an occasional 0x0A byte,
    or it will never hit the limit. What if it's running on a Raspberry
    Pi, and someone posts a link to a 2GB video file?

iter_lines has its own chunk size, unspecified here, but that defaults
to 512 bytes. in other words, i believe iter_lines reads the minimum
number of bytes between 512 bytes and "whatever i need to read up to a
newline". so basically, we read 512-byte chunks, looking for a </title>
tag, but stop after 640kbytes.

so yes, this will use up to 640kbytes of ram. it seemed rather high when
i was reading the code, especially considering the reference to all
the memory of original PCs. ;) i'd be happy to reduce that to a more
reasonable number, say like 128KB or 64KB (you guys have a quote from
Wozniak on that one? ;)

see:

http://docs.python-requests.org/en/latest/api/#requests.Response.iter_lines
http://docs.python-requests.org/en/latest/user/advanced/#streaming-requests
https://en.wikipedia.org/wiki/Conventional_memory
https://en.wikiquote.org/wiki/Talk:Bill_Gates
http://www.computerworld.com/article/2534312/operating-systems/the--640k--quote-won-t-go-away----but-did-gates-really-say-it-.html

... and so on.

a.

During times of universal deceit, telling the truth becomes a
revolutionary act. - George Orwell

maxpowa commented Jan 6, 2016

I don't know enough about requests, but iter_lines attempts to decode to a string right? When we are looking at a 2GB video file, it wouldn't be able to decode anything. You'd never hit your limiting and you'd always download the entire file, from what I understand at least.

anarcat commented Jan 6, 2016

that is not what i read from the documentation.

but, only the source code would tell you that, right? in fact, reading the source code, iter_lines uses iter_content which in turn calls urllib3 (response.raw is a urllib3.response object):

https://github.com/kennethreitz/requests/blob/master/requests/models.py#L692
https://github.com/kennethreitz/requests/blob/master/requests/models.py#L645
https://github.com/kennethreitz/requests/blob/046a0f215de3fac60326ea1c2c9dd30beb329f61/requests/packages/urllib3/response.py#L263

then that reads at most the bytes specified and decodes the result. it won't read the whole 2GB file just to see if it can decode it better.

so i don't think those concerns are warranted. but then you can make a unit test to see if that's really a problem. in my tests, as i said, the requests version is faster than the native one, which can only be explained by the fact that it really does read only 512 bytes at a time, because otherwise the requests overhead would make it slower, not faster.

maxpowa commented Jan 6, 2016

No, I'm not saying it wouldn't read in 512-byte chunks; I'm saying it would read every 512-byte chunk. You'd still end up downloading 2GB on your network, rather than just 640kb.

    content = ''
    for line in response.iter_lines(decode_unicode=True):
        content += line
        if '</title>' in content or len(content) > max_bytes:

anarcat commented:

wouldn't this catch such a case? wouldn't len(content) > max_bytes?

anarcat commented Jan 6, 2016

i guess i don't see the code path you are referring to. i mean if you are worried that iter_lines() just never returns until it has read beyond 640KB, that is one case. i don't think it will happen, from reading the requests code. now if you think my code will loop beyond 640KB, that's a completely different case, yet i still don't see how it could happen. would len(content) be zero?

could you clarify the code path or provide a test case?

maxpowa commented Jan 6, 2016

Just tested, I was thinking that something like http://mirror.internode.on.net/pub/test/1meg.test would cause it to download the entire thing (because len(content) would be 0), but it just fails out with the UnicodeDecodeError.
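
As an illustration only (this does not claim to reproduce where the bot actually failed), a strict decode of arbitrary binary data raises `UnicodeDecodeError`, while decoding with `errors="replace"` does not. The helper name `decode_chunk` is made up for this sketch.

```python
def decode_chunk(data, encoding="utf-8", strict=True):
    """Decode bytes to text; hypothetical helper for illustration."""
    errors = "strict" if strict else "replace"
    return data.decode(encoding, errors=errors)


binary = b"\xff\xfe\x00\x01"  # not valid UTF-8

try:
    decode_chunk(binary)
    failed = False
except UnicodeDecodeError:
    # Strict decoding refuses bytes that are not valid in the encoding.
    failed = True

# With errors="replace", undecodable bytes become U+FFFD instead of raising.
lenient = decode_chunk(binary, strict=False)
```

Which of the two behaviours is preferable depends on whether a garbled title is better than no title at all.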

elad661 commented Jan 11, 2016

If we use requests we can limit the amount of data downloaded, according to this stackoverflow link https://stackoverflow.com/questions/22346158/python-requests-how-to-limit-received-size-transfer-rate-and-or-total-time

anarcat commented Jan 11, 2016

On 2016-01-11 08:47:18, Elad Alfassa wrote:

If we use requests we can limit the amount of data downloaded, according to this stackoverflow link https://stackoverflow.com/questions/22346158/python-requests-how-to-limit-received-size-transfer-rate-and-or-total-time

This is essentially what my code is proposing, except the example
above uses iter_content() instead of iter_lines() and also a timeout.

a.

Freedom is being able to make decisions that affect mainly you. Power
is being able to make decisions that affect others more than you. If
we confuse power with freedom, we will fail to uphold real freedom.
- Richard Stallman

anarcat commented Jan 24, 2016

from the 6.2 release notes:

sopel.web is now deprecated in favor of the third-party requests library

so what's up here? is there anything i can do here to push this ahead?

elad661 commented Jan 29, 2016

So my hunch is that iter_lines will look for \n, if you happen to download binary content, you might not see any \n in the file at all, and it will just keep downloading. This is why I think iter_content is safer.

I still want this pull request merged, because we need to port away from web.py to requests, but I'll have to test it to see if my hunch is correct, and think of a way to fix this if it is.

anarcat commented Jan 29, 2016

i thought i made it pretty clear in #988 (comment) that iter_lines calls iter_content so it's all the same:

    def iter_lines(self, chunk_size=ITER_CHUNK_SIZE, decode_unicode=None, delimiter=None):
        # [...]
        for chunk in self.iter_content(chunk_size=chunk_size, decode_unicode=decode_unicode):
            # [...] line splitting...
            for line in lines:
                yield line

so if you have a large binary file, the code path will be:

  1. call iter_content(ITER_CHUNK_SIZE), that is, download 512 bytes
  2. split the result on newlines into a list of lines
  3. yield each line from that list in turn

so if you call iter_lines() in an iterator context (which I am doing), it will not download the whole file. it would if you did list(iter_lines()) but that would be silly. furthermore, i explicitly check the size of the chunks accumulated and abort after our own limit:

            if '</title>' in content or len(content) > max_bytes:

so you're welcome to follow that code path again, but i'm pretty confident it's doing the right thing.

elad661 commented Jan 29, 2016

Okay, got it. I'll run a few tests just to make sure, and if it works well I'll merge it.

anarcat commented Jan 29, 2016

great, thanks! testing would be easier, btw, if the url module would talk when it doesn't find a title. :) just by the speed of the response you could tell if it's downloading the whole file or not... users in the channel often complain that they feel the bot is dead when it doesn't reply when they post a URL... most often it's because of SNI, but sometimes it's PDFs and people are surprised the bot is silent...

elad661 commented Jan 29, 2016

in the past, it did, but it annoyed people when they posted links for things that can't be handled by this module, so we silenced it.

So now when working on the url module the best thing to do is to add debug prints while you're working, and remove them before committing.

elad661 commented Jan 30, 2016

So far my main concern with this is that requests doesn't seem to support non-unicode stuff when the content-type is only defined within the html file itself. It makes sense for them, because they're an http library, not an html one. It tries to decode everything as unicode even if the Content-Type header doesn't specify a charset.

I'll try to find a solution.
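
One possible direction, sketched under assumptions: sniff a charset declared in a `<meta>` tag from the raw bytes when the HTTP header does not name one. The helpers `sniff_charset` and `decode_html` are hypothetical, not part of Sopel or requests, and the regex is deliberately simplistic rather than a full HTML parser.

```python
import codecs
import re

# Matches both <meta charset="..."> and the http-equiv form "charset=...".
META_CHARSET_RE = re.compile(
    rb"<meta[^>]+charset=[\"']?([A-Za-z0-9_-]+)", re.IGNORECASE
)


def sniff_charset(raw, default="utf-8"):
    """Return a charset declared in a <meta> tag, or ``default``."""
    match = META_CHARSET_RE.search(raw)
    if match is None:
        return default
    name = match.group(1).decode("ascii")
    try:
        # Only accept names that Python knows how to decode.
        codecs.lookup(name)
    except LookupError:
        return default
    return name


def decode_html(raw, header_charset=None):
    """Decode page bytes, preferring the header charset when it is given."""
    charset = header_charset or sniff_charset(raw)
    return raw.decode(charset, errors="replace")
```

The sniffing only needs the first bytes of the page, so it fits with a size-limited download.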

elad661 commented Jan 30, 2016

Oh. Looks like we didn't support this specific case with web.py either. I just wasted an hour researching it before realizing. Oops.

Anyway, the iter_lines solution is not good enough. On a binary file without many newlines, the thread hangs and the bot uses 100% CPU. We clearly don't want that.

This probably happens because iter_lines keeps looking for newlines in iter_content.
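
The effect can be illustrated with a simplified model of newline splitting over a chunk stream. This is a sketch written for this discussion, not the requests implementation; `iter_lines_model` and `counted` are made-up names, and only the buffering idea is kept.

```python
def iter_lines_model(chunks):
    """Simplified model of splitting a stream of byte chunks into lines."""
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; the tail is kept.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line
    if pending:
        yield pending


def counted(chunks, counter):
    """Wrap an iterable and count how many chunks were pulled from it."""
    for chunk in chunks:
        counter["n"] += 1
        yield chunk


# HTML-like input: the first line is available after a single chunk.
html_count = {"n": 0}
html = [b"<title>hi</title>\n"] + [b"x" * 512] * 99
first_html = next(iter_lines_model(counted(html, html_count)))

# Binary-like input without newlines: nothing is yielded until the end.
bin_count = {"n": 0}
binary = [b"\x00" * 512] * 100
first_bin = next(iter_lines_model(counted(binary, bin_count)))
```

In this model, a size check in the caller's loop body cannot run before the generator yields its first line, so input without newlines is read to the end regardless of the limit.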

If we want to iterate over lines, we have two options: either implement iter_lines ourselves with the limit built in instead of using the existing requests code, or monkey patch iter_content to stop after max_bytes read. I'm not sure which is cleaner, but whatever we do we probably can't copy code from requests because of licensing issues.

Alternatively, we can just read the content until max_bytes is reached, and then split that by lines and process them. This is obviously easier, and I don't think the performance penalty will be too bad.
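
A minimal sketch of that last option, assuming the chunks come from something like `response.iter_content(chunk_size=512)` on a streamed request. The function names, the regex and the default limit are assumptions for illustration, not the code that was eventually merged.

```python
import itertools
import re

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def read_limited(chunks, max_bytes):
    """Pull byte chunks until max_bytes have been collected, then stop."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) >= max_bytes:
            break
    return bytes(buf[:max_bytes])


def find_title_in_chunks(chunks, max_bytes=640 * 1024, encoding="utf-8"):
    """Return the page title found in the first max_bytes, or None."""
    raw = read_limited(chunks, max_bytes)
    text = raw.decode(encoding, errors="replace")
    match = TITLE_RE.search(text)
    if match is None:
        return None
    # Collapse newlines and runs of whitespace inside the title.
    return " ".join(match.group(1).split())


# A title split across two chunks is still found.
page = [b"<html><head><ti", b"tle>Hello\n  World</title></head>"]
title = find_title_in_chunks(page)

# An endless binary stream stops after max_bytes instead of hanging.
endless = itertools.repeat(b"\x00" * 512)
no_title = find_title_in_chunks(endless, max_bytes=2048)
```

Because the limit is enforced on raw bytes before any line handling, the presence or absence of newlines no longer matters.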

@embolalia what do you think?

elad661 pushed a commit that referenced this pull request Jan 30, 2016
Now that web.py is deprecated, we can port url.py to requests.

Originally from pull request #988, committed here with minor bugfixes
and modified commit message.
elad661 pushed a commit that referenced this pull request Jan 30, 2016

elad661 commented Jan 30, 2016

Merged with a bugfix (not using iter_lines), squashed two commits into one, and reworded the commit message (since SNI is no longer the reason we wanted this in).

Unfortunately because I squashed the commits and reworded them, github doesn't know I just merged it, but I did.

4cceef6

Thanks!

@elad661 elad661 closed this Jan 30, 2016
@anarcat anarcat deleted the sni branch January 30, 2016 16:30
kwaaak added a commit to kwaaak/sopel that referenced this pull request Mar 22, 2018
* update movie.py to use omdbapi as imdbapi is non-functional

* minor cosmetic change

* Resolve sopel-irc#926

Also cleaned up dict value retrieval a little, the .get() calls were a bit unnecessary.

* Remove feedparser dep from requirements

* Remove feedparser from RPM spec

* Make CAD currency code case-insensitive

* Resolve sopel-irc#929

Ensures that FilenameAttribute's parse and serialize always have the parameters they need.

* [isup] fix bad indexes preventing bot from recognizing http protocols

`'http://'` is 7 characters long, so `site[:6]` can't ever match it. ditto for `'https://'` and `site[:7]`

* Switch back to get() calls

There's some places where there were already try catch blocks for KeyError, so I left those ones.

* core: Fix issues with reloading folder modules

Closes sopel-irc#899, obviates and closes sopel-irc#932.

* Switch from *args to named args.

* Properly fetch the xml before passing it to xmltodict

I'm a moron.

* Release 6.1.0

* Use flake8 for future checking, and add missing ones

* Fix coding declaration in a few places it was still wrong

Again, fuck windows. sopel-irc#821

* README: Update sopel package location in Arch

* Fix crash during configure

Reported by aam in IRC, http://pastebin.com/0rAQb7Kp

* Fix TypeError: 'NoneType' object is not iterable

Happened when an invalid language hint abbreviation was given.

* Require a valid phrase to translate

* [meetbot] fix crash when starting meeting

Resolves sopel-irc#942

* Release 6.1.1

* Require chanmsg for all adminchannel commands

* Fix sopel-irc#945

We should only get `socket.gaierror` if the user passes an invalid IP or hostname, but it could be in the event that we don't have DNS configured on the machine or a multitude of other things, so we can be somewhat vague in the error message.

* Fix sopel-irc#860

Should resolve the inconsistencies between the implementation of unquote in python 2 and python 3.

* [Formatting]Gray is the same as Grey. Which spelling is correct is a grey area.

* Updated link for help command

was a bad link, updated to correct one

* [countdown] fix remaining time calculation

* [docs] Rename

* Add tracking of users and their accounts

This uses some of the code that @maxpowa wrote for sopel-irc#941, but gives a
somewhat more intuitive API. It also paves the way for potentially
adding direct support for away-notify and metadata-notify.

* rename references to willie in systemd service file

* update project url in comments

* update project url in comment

* update project name in CONTRIBUTING.md (includes log files location, issue tracking url)

* Update github URL to organization, Sopel-IRC, rather than Embolalia

* [dice] Handle negative numbers and uppercase letters.

First, add case-insensitivity to the "d" and "v" letters in the input
regex by capturing [dD] and [vV] in regexes.

Next, make sure we can handle negative numbers properly by capturing
a possible "-" in front of any numbers. If dice_num or dice_type ends
up negative, abort _roll_dice early (dice_type already had a negative
check that wasn't being hit because "-" wasn't captured).

* [dice] Also handle negative drop_lowest values.

* Make BTC currency code case-insensitive

* Address comments on PR sopel-irc#961

This includes being more consistent about using pop rather than del to
prevent key errors, and adding some locking around the privilege related
things

* Add user tracking for RFC WHO replies

* Add away tracking

* Add support for account-tag

* Enable the account-tag capability

* Replace channels list with channels dictionary

Hopefully, nobody else is taking advantage of channels being a list,
rather than a dict. If they are, well, oops.

* Add enumeration of IRC events

Closes sopel-irc#960

* Add cap-notify support

See sopel-irc#971

* coretasks: Replace numeric events with their enums

Also add the missing RPL_WHOSPCRPL to tools.events

* [contrib] rename & edit willie out of contrib

Fixes sopel-irc#963

* Huge cleanup of copyright headers and docstrings

They're still super inconsistent and probably a lot are out of date, but
at least there won't be random copyright info showing up in the docs
anymore. Oh, and my domain and name are correct now, too…

* [currency] Make arguments case-insensitive (close sopel-irc#979)

* Fix URL excludes loading (sopel-irc#959)

setup uses it as a list, and in previous versions of sopel it was a list, but in the UrlSection it's defined as a ValidatedAttribute. This was causing each character in the excludes list to be parsed as a regex exclude. Switching to ListAttribute fixes the issue.

* [find_updates] Fix missing RPL_ prefix

* [bot] Fix misleading message

Coretasks is only one module, so if you loaded it and only one other, you'd get the "Couldn't load any modules" warning, even though there was a module loaded.

* [trigger] IRCv3 server-time support

* [tests] Create more trigger test cases

* [tests] Create module tests

* [tests] DB test cleanup

* [tests] Add formatting tests

* [tests] Fix coding declaration

* [tests] Update .travis.yml

* [translate] Fix bad test

* Support using account name for auth

* [tests] Remove pep8 dev-requirement

Also fixed the critical error where py.test thought sopel.py (the entry script) was the sopel package.

* [pep8] Minor clean up to conform with pep8

* [web] Ensure ca_certs is defined

When running tests it's not defined at all, because the only place defining it was the config loader code. Now it's defined, but will still fail out since it's not a valid CA certificate file.

* [pep8] Final pep8 run

Everything should now conform to pep8 and pass flake8.

* [tests] Improve trigger coverage

* [reddit] Prevent specific commands in PM

Resolves sopel-irc#789

* Documentation for target types

* docs/core: Shift around functions to make autodoccing easier

* docs: Start on a major cleanup of documentation

* Make _ssl_recv always return bytes

_ssl_recv returned empty strings instead of the empty bytes object if
the socket was closed or upon ENOENT.

This led to exceptions when running sopel with python3 because
asynchat.handle_read expects byte objects.

This commit fixes sopel-irc#937.

* docs: Majorly overhaul organization and format

* Release 6.2.0

* Release 6.2.1

* dice: Allow comma delimiter

Closes sopel-irc#998

* adminchannel: Remove totally useless commands

Also add error messages to the somewhat useful ones

* Make it a bit harder to run into the LC_ALL thing

This behavior is stupid. Respecting LC_ALL, or anything else for that
matter, over the encoding fucking noted in the fucking file is a bad
decision, and someone should feel bad. I don't know why it makes things
break in the specific and bizarre way it does, but it does, and there's
no possible good reason for it.

Closes sopel-irc#984. Fuck.

* Escape nick before replacing it in regex

Resolves sopel-irc#1004

* [weather] Fix YQL woeid lookup

Handles an edge case that neither of the PRs handled

Fixes sopel-irc#1006
Closes sopel-irc#1007, sopel-irc#1012

* CONTRIBUTING: Let's drop the [brackets] thing and do what everyone else does

No point in being different from any other FOSS project out there.

* contrib/rpm: willie->sopel

* CONTRIBUTING: Update coding/future import guidelines

`# coding=utf-8` is now the standard in Sopel & supports windows. The future import now conforms with the flake8 future imports (also conforms with @embolalia's formatting passes a bit back)

* web: make web.py into a requests compatibility layer

Since web is deprecated and everyone should switch to requests,
the first step is to make web.py a requests compatibility layer.

When web.py was new, requests was not ripe enough to use in Sopel.

But now it's time to switch to requests like the rest of the python
ecosystem. web.py is no more.

* find_updates: switch from web.py to requests

* web: Fix typo

This is why we have flake8 ;)

* url: port to requests

Now that web.py is deprecated, we can port url.py to requests.

Originally from pull request sopel-irc#988, committed here with minor bugfixes
and modified commit message.

* translate: port to requests

* movie: port to requests

* xkcd: port to requests

* Track the channel topic

* Improve locale stupidity checking

Thanks to @elad661's comment on b73fc6a

* url: handle capitalized URLs

Trigger rules are case-insensitive regexes, so the auto title responder
will be triggered even for capitalized URLs such as "Http://google.com"
(which can happen, for example, when a mobile device attempts to
auto-capitalize the beginning of a sentence).  Match URLs case
insensitively for title lookup purposes and add error handling in case
no URLs could be extracted from the match.

* core: Tweak rate limiting to be more effective

This doesn't solve the issue, but it should make it slightly less
critical. sopel-irc#952

* reddit: fetch posts by submission_id

Previously, the reddit module fetched posts by the full URL of the post.
This led to RedirectExceptions in some cases, for example when someone
links a naked reddit.com URL instead of www.reddit.com.  Instead, match
only the post ID and pass it to get_submission.

* Release 6.3.0

* Don't warn about non-UTF8 locales when running on Python 2

Python 2 doesn't change string behavior according to the locale env,
that's a py3 specific weirdness.

Also, reword the error message to better explain the issue to the user.

* Fix apparent typo in host_blocks initialization

* Added missing import to xkcd.py

* core: Fix print in handle_error when reaching the exception limit

Fixes issue sopel-irc#1025

* trigger: Fix target for QUIT events

This was fun to debug! Basically, Sopel encountered an exception
when removing unknown users when they QUIT. While this shouldn't
happen, it should still be handled gracefully.

Since it was an exception, Sopel's response was to try and send
the exception line to the channel (sender) the message came from, but
QUIT events don't come from channels (or users by PRIVMSG)!

Since QUIT was not special-cased, the naive assumption that
the first argument is the "sender" was used, and when Sopel
tried to send the exception line to the "sender", and the sender
had a space in it, this would lead to spam if a user exists with a nick
that is identical to the first word in the QUIT message. Ouch.

The fix special-cases QUIT in pretrigger to never have a "sender".
I also added a test to make sure we parse QUIT correctly.

This solves issue sopel-irc#1026.

* core: Never try to send an exception line when sender is None

Just in case.

* Make sure we're working with UTF8 string

Depending on the URL, response.iter_content() in Python 2/3 will return either:
- `type 'unicode'`/`class 'str'` (for "plain" HTML)
- `type 'str'`/`class 'bytes'` (for binary streams, like file URLs)

To distinguish between the two situations we're checking if we got string or
bytes, and proceed accordingly.

This also fixes sopel-irc#1021

* [doc] unify grammatical number of `@commands` example

It seems there is no `@command` decorator. 

At any rate, examples for `@commands` should use the same (and correct) grammatical number.

* Release 6.3.1

* FIX: Private BZ's - AttributeError: 'NoneType'

* [calc] Remove .wa, as API now requires a key

Will be moved to an external module that supports the new API

* search: Remove ad URL results

DDG changed their HTML output slightly and that threw us off, this *should* fix the r.search.yahoo.com URLs that .g was returning.

* Fix issue sopel-irc#1048

* fix .set command for non filename attributes

* Fix loading/reloading modules that share the name of the bot owner

* Typo correction

deamon -> daemon

Squashed into a single commit.

* Fix config loading in some edge cases

Fixes sopel-irc#999

Usually where try/except wouldn't catch NoOptionError, happens when running tests in specific environments.

* add groupdict function to triggers (sopel-irc#1061)

* Add IRCv3 extended-join tests

* Add regular join test

* Replace e.message with str(e), e.message has been deprecated since python 2.6

* Fix nickname examples

help_prefix shouldn't replace the first character if it's a nickname example.

* Fix syntax error

* weather: fix location yql query

Resolves sopel-irc#1050 and sopel-irc#1029

* Implement proper extended-join support

* search: tweak ad result blocking

A slight regex change to avoid yahoo ad results from duck duck go if it ends up using the HTML search

* irc: toggle error replies (sopel-irc#1071)

Adds config option to toggle Sopel replying directly to the error source.

* irc: always send exceptions to logger

We don't really need to check if `trigger.sender` is `None` when we're sending to the logger -- as long as `trigger` is defined, we'll be fine.

This just ensures that the `logging_channel` will always get the exception messages. Also pre-formats the message using format because it's more clear what's going on this way.

* Create suppress-warnings.py

Can be dropped into ~/.ipython/profile_default/startup/ to suppress the DeprecationWarnings you get when starting Sopel with iPython enabled

* run_script: if argv is specified, use it

* [announce] Confirm when all announces have been sent (sopel-irc#1044)

* Add global and channel rate limits (sopel-irc#1065)

* Add global and channel rate limits

* Default user rate and compatibility with jenni modules

* Fix critical keyerror bug in rate limiting

* Simplify syntax for @rate() decorator and update docs

* Don't reset function timer during cooldown

* fix channel time diff variable

* fix indentation in bot.py

* weather: catch empty forecast results (sopel-irc#1077)

e.g. when the user enters a continent for the location.

* irc: treat error in connect as a disconnect (sopel-irc#845)

* irc: test suite enhancement

Comes with some tweaks to support tests
Daemonizes the ping and timeout threads (they should have been in the first place)

* coretasks: prevent KeyError when untracked user leaves

Fixes sopel-irc#1005

* web: fix header bleed (sopel-irc#1092)

Resolves sopel-irc#1091

* seen: be a smart-ass if people ask the bot about itself (sopel-irc#1086)

* module: ignore privilege requirement in privmsg (sopel-irc#1093)

Resolves sopel-irc#1087

* run_script: fix PID file checking logic when the file is empty

This fixes issue sopel-irc#1075

I don't know why the elif explicitly negated the previous condition; it's
obviously not needed, because elif already implies the previous
condition is False.

Also, whoever added the parentheses there messed up the logic even further.
Before they were there, it worked okay, even if the condition was a bit
more verbose than logically needed. Well, that's what you get when you
blindly try to make code conform to PEP8 without actually reading it.

* unicode_info: fallback if input is None (sopel-irc#958)

Resolves sopel-irc#957

* db: raise ValueError in unalias_nick to match documentation (sopel-irc#1102)

Documentation says that a ValueError should be raised if there is not at least one other nick in the group.

Resolves sopel-irc#1101

* Update .gitignore (sopel-irc#1110)

Renamed willie references to sopel
Added .DS_Store ignore

* coretasks: tweak topic tracking (sopel-irc#1111)

Support a different implementation of topic updates; RPL_TOPIC appears to be sent only to the user who actually updated the topic.

Resolves sopel-irc#1107

* meetbot/url: fix SSLError

The core.verify_ssl was not passed to url.find_title(), resulting in SSL errors on sites with invalid certs when `verify_ssl = False`

Slight refactor of @psachin's original code for backwards compatibility. Resolves sopel-irc#1113

* coretasks: add support for authentication on Quakenet (sopel-irc#1122)

Added the necessary lines for authenticating Sopel with Q. The implementation is almost exactly like AuthServ's. Added Q to core_section along with the other authentication methods, since it is now supported.

* coretasks: remove .lower() on auth_method

auth_method may be None if it's unset, forgot about that case when merging.

Resolves sopel-irc#1124

* setup: tweak requirements

Remove unsupported requires statement in setup.py
Pin requests dependency to 2.10.0 as 2.11.0 introduced a breaking change against the url.py module

* sopel/trigger.py: fix intent_regex

* url: make find_title more robust

Previously, each 512-byte chunk was decoded on its own, so an incomplete UTF-8 sequence at a chunk boundary was prone to decoding mishaps. Now we decode all of the content at once, ignoring errors.

The old problem appears reliably for pages with many high codepoints:

~~~
<user> http://www3.nhk.or.jp/news/easy/k10010665021000/k10010665021000.html
<bot-old> [ NEWS WEB EASY|������������人���� ] - www3.nhk.or.jp
~~~
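
The effect can be reproduced with a small sketch, assuming Python 3. The sample title, the tiny chunk size (4 bytes instead of 512), and both helper names are made up for illustration and are not the module's code; `errors='replace'` is used in the per-chunk version only to make the damage visible:

```python
text = 'NEWS WEB EASY|日本語のニュース'
data = text.encode('utf-8')


def decode_per_chunk(raw, size):
    """Decode each chunk separately; sequences split across chunks break."""
    chunks = [raw[i:i + size] for i in range(0, len(raw), size)]
    return ''.join(c.decode('utf-8', errors='replace') for c in chunks)


def decode_at_once(raw):
    """Decode the whole buffer in one call, ignoring undecodable bytes."""
    return raw.decode('utf-8', errors='ignore')


print(decode_per_chunk(data, 4))
print(decode_at_once(data))
```

Decoding the whole buffer in one call keeps multi-byte characters intact, because no character is cut in half at a chunk boundary.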

* [reddit] Change NSFW tag to SPOILERS for some subs

Hard-coded rather than configured, since in theory the same list should
apply to everyone, and we should merge in new ones. That and effort.

* setup: Be more flexible about requests version

* Release 6.4.0

* Notify if Bugzilla is private (sopel-irc#1115)

Although the primary error no longer exists, the bot shows nothing if
the Bugzilla bug has an invalid alias or an invalid ID, or if the bot has
no permission to access the bug. The logs should show warnings such as:

  WARNING:sopel.modules.bugzilla:Bugzilla error: NotFound <- (Invalid ID)
  WARNING:sopel.modules.bugzilla:Bugzilla error: NotPermitted <- (No permission)

This patch should notify about those errors.

Closes-Bug: sopel-irc#1112

Signed-off-by: Sachin Patil <[email protected]>

* Lint imports (sopel-irc#1085)

After realizing I'd left a dead import in calc.py after removing the .wa command,
I decided to go through and clean up any other imports that didn't appear to be in
use any more.

* Fixed missing `verify_ssl` param

- `verify_ssl` param was missing in few function calls

Closes-Bug: sopel-irc#1118

Signed-off-by: Sachin Patil <[email protected]>

* Add a decorator for url handling

Closes sopel-irc#761. Also add xkcd url handling as a demo.

* Add docstring for url decorator

* Create a gist with the command list

Closes sopel-irc#1105
Closes sopel-irc#1080

* Cache help gist location

* Use custom user agent for title requests

* Add Travis badge

* Increase timeout for DB locked error

This doesn't fix sopel-irc#736, but should at least make it less common

* Fix CI

* Release 6.5.0

* [weather] Use help_prefix in hint text when no location given

* Add a pronouns module

If witch-house/pronoun.is#40 gets merged, it's
probably worth porting to use that, since there are a *lot* of pronoun
sets.

Yes, this should probably support other languages. Sopel's i18n is
horrible and I know it.

* Fix asking for another user's pronouns

* Be a bit less snarky when asked for the bot's pronouns

But only a little

* [etymology] unescape all known HTML entities

Replace bespoke implementation of unescape() with stdlib tools; fix sopel-irc#1153.

* Fixes for pronouns.py

Fixes setpronouns error on lack of trigger.group(2), fixes autocomplete of nicks with a space so that it's stripped out automatically, fixes that it will say the wrong username if you request someone else's pronouns.

* Fixes ConnectionError

url.find_title() throws ConnectionError when the hostname or IP address is
not reachable, and therefore fails to read the title.

Sample error
```
15:11:05 psachin:     https://10.65.177.15
15:11:09 BB-8:        requests.exceptions.ConnectionError: HTTPSConnectionPool(host='10.65.177.15', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7feb41b6a2e8>: Failed to establish a new connection: [Errno 113] No route to host',)) (file "/home/tss/virtualenvs/sopel/lib64/python3.5/site-packages/requests/adapters.py", line 487, in send)
```

* Update web.py

fix blank User-Agent, if a custom user-agent is set for web.get()

* Use a common user-agent to get the proper results from DDG

* Update search.py

Hmm maybe single quotes would be better.

* fix some missed stuff

* Missed one more header copy.  Hopefully last one.

* Fix typo

* Remove duplicate item in triggerable tuple check

* Exclude File links from regex matching

Fix sopel-irc#1182

* Added default value to numbered_result

Added missing default value of "True" for "verify_ssl" parameter on "number_result".

* IP example

Fixed broken IP module example

* Fix API urls for Bank of Canada and BitcoinAverage

* Upper/lowercase shouldn't matter for tell module

* Release 6.5.1

* weather: update from deprecated sopel.web to requests

* safety module - catch exception on urllib/parse

* Fix reddit module

* Actually fix reddit

* [ip] Fix example/test (Google Inc. => Google LLC)

Google changed to an LLC, and updated its AS information, which broke
the test assertion.

Changes cherry-picked from sopel-irc#1250 and reworded.

* Update ignored files for tests

Ignore movie.py module because it requires an API key (and will probably
be moved out of core anyway).

Fix ignores for entry script and ipython module (which were still using
the old "willie" name and therefore weren't ignored). This also allows
removing the command-line ignore from the Travis build script.
kwaaak added a commit to kwaaak/sopel that referenced this pull request Mar 23, 2018
Delete ip.py

ip.py

update2 (#2)

* update movie.py to use omdbapi as imdbapi is non-functional

* minor cosmetic change

* Resolve sopel-irc#926

Also cleaned up dict value retrieval a little, the .get() calls were a bit unnecessary.

* Remove feedparser dep from requirements

* Remove feedparser from RPM spec

* Make CAD currency code case-insensitive

* Resolve sopel-irc#929

Ensures that FilenameAttribute's parse and serialize always have the parameters they need.

* [isup] fix bad indexes preventing bot from recognizing http protocols

`'http://'` is 7 characters long, so `site[:6]` can't ever match it. ditto for `'https://'` and `site[:7]`
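
A quick sketch of the slicing problem described above; `has_http_scheme` is a hypothetical helper for illustration, not the module's actual code:

```python
def has_http_scheme(site):
    """Return True if site starts with an http:// or https:// scheme."""
    # The slice must be as long as the prefix it is compared against:
    # len('http://') == 7 and len('https://') == 8.
    return site[:7] == 'http://' or site[:8] == 'https://'


# A 6-character slice can never equal a 7-character prefix.
print('http://example.com'[:6] == 'http://')
print(has_http_scheme('http://example.com'))
```

Using `site.startswith(('http://', 'https://'))` would avoid counting characters by hand altogether.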

* Switch back to get() calls

There's some places where there were already try catch blocks for KeyError, so I left those ones.

* core: Fix issues with reloading folder modules

Closes sopel-irc#899, obviates and closes sopel-irc#932.

* Switch from *args to named args.

* Properly fetch the xml before passing it to xmltodict

I'm a moron.

* Release 6.1.0

* Use flake8 for future checking, and add missing ones

* Fix coding declaration in a few places it was still wrong

Again, fuck windows. sopel-irc#821

* README: Update sopel package location in Arch

* Fix crash during configure

Reported by aam in IRC, http://pastebin.com/0rAQb7Kp

* Fix TypeError: 'NoneType' object is not iterable

Happened when an invalid language hint abbreviation was given.

* Require a valid phrase to translate

* [meetbot] fix crash when starting meeting

Resolves sopel-irc#942

* Release 6.1.1

* Require chanmsg for all adminchannel commands

* Fix sopel-irc#945

We should only get `socket.gaierror` if the user passes an invalid IP or hostname, but it could also occur when DNS isn't configured on the machine, or for a multitude of other reasons, so we can be somewhat vague in the error message.

* Fix sopel-irc#860

Should resolve the inconsistencies between the implementation of unquote in python 2 and python 3.

* [Formatting]Gray is the same as Grey. Which spelling is correct is a grey area.

* Updated link for help command

was a bad link, updated to correct one

* [countdown] fix remaining time calculation

* [docs] Rename

* Add tracking of users and their accounts

This uses some of the code that @maxpowa wrote for sopel-irc#941, but gives a
somewhat more intuitive API. It also paves the way for potentially
adding direct support for away-notify and metadata-notify.

* rename references to willie in systemd service file

* update project url in comments

* update project url in comment

* update project name in CONTRIBUTING.md (includes log files location, issue tracking url)

* Update github URL to organization, Sopel-IRC, rather than Embolalia

* [dice] Handle negative numbers and uppercase letters.

First, add case-insensitivity to the "d" and "v" letters in the input
regex by capturing [dD] and [vV] in regexes.

Next, make sure we can handle negative numbers properly by capturing
a possible "-" in front of any numbers. If dice_num or dice_type ends
up negative, abort _roll_dice early (dice_type already had a negative
check that wasn't being hit because "-" wasn't captured).
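
Roughly, the idea looks like this. This is a simplified sketch, not the module's real regex; `DICE_RE` and `parse_dice` are hypothetical names, and the pattern only covers the `NdM` and `NdMvK` forms:

```python
import re

# [dD] and [vV] accept either case; -? captures a possible minus sign
# so negative values can be detected instead of silently not matching.
DICE_RE = re.compile(r'^(-?\d+)?[dD](-?\d+)(?:[vV](-?\d+))?$')


def parse_dice(expr):
    """Return (dice_num, dice_type, drop_lowest), or None if invalid."""
    match = DICE_RE.match(expr)
    if not match:
        return None
    dice_num = int(match.group(1)) if match.group(1) else 1
    dice_type = int(match.group(2))
    drop_lowest = int(match.group(3)) if match.group(3) else 0
    # Abort early when the number of dice or the die type is negative.
    if dice_num < 0 or dice_type < 0:
        return None
    return dice_num, dice_type, drop_lowest


print(parse_dice('2D6V1'))
print(parse_dice('-2d6'))
```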

* [dice] Also handle negative drop_lowest values.

* Make BTC currency code case-insensitive

* Address comments on PR sopel-irc#961

This includes being more consistent about using pop rather than del to
prevent key errors, and adding some locking around the privilege related
things

* Add user tracking for RFC WHO replies

* Add away tracking

* Add support for account-tag

* Enable the account-tag capability

* Replace channels list with channels dictionary

Hopefully, nobody else is taking advantage of channels being a list,
rather than a dict. If they are, well, oops.

* Add enumeration of IRC events

Closes sopel-irc#960

* Add cap-notify support

See sopel-irc#971

* coretasks: Replace numeric events with their enums

Also add the missing RPL_WHOSPCRPL to tools.events

* [contrib] rename & edit willie out of contrib

Fixes sopel-irc#963

* Huge cleanup of copyright headers and docstrings

They're still super inconsistent and probably a lot are out of date, but
at least there won't be random copyright info showing up in the docs
anymore. Oh, and my domain and name are correct now, too…

* [currency] Make arguments case-insensitive (close sopel-irc#979)

* Fix URL excludes loading (sopel-irc#959)

setup uses it as a list, and in previous versions of sopel it was a list, but in the UrlSection it's defined as a ValidatedAttribute. This was causing each character in the excludes list to be parsed as a regex exclude. Switching to ListAttribute fixes the issue.
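
The failure mode is easy to reproduce in plain Python. The sketch below does not use Sopel's config classes; `compile_excludes` is a hypothetical stand-in for the code that turns the configured excludes into regexes:

```python
import re


def compile_excludes(excludes):
    """Compile every item of excludes into a regex."""
    return [re.compile(pattern) for pattern in excludes]


# Iterating over a string yields one character at a time, so a plain
# string value produces one regex per character.
print(len(compile_excludes('example.com')))

# A real list yields one regex per configured pattern.
print(len(compile_excludes(['example.com'])))
```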

* [find_updates] Fix missing RPL_ prefix

* [bot] Fix misleading message

Coretasks is only one module, so if you loaded it and only one other, you'd get the "Couldn't load any modules" warning, even though there was a module loaded.

* [trigger] IRCv3 server-time support

* [tests] Create more trigger test cases

* [tests] Create module tests

* [tests] DB test cleanup

* [tests] Add formatting tests

* [tests] Fix coding declaration

* [tests] Update .travis.yml

* [translate] Fix bad test

* Support using account name for auth

* [tests] Remove pep8 dev-requirement

Also fixed the critical error where py.test thought sopel.py (the entry script) was the sopel package.

* [pep8] Minor clean up to conform with pep8

* [web] Ensure ca_certs is defined

When running tests it's not defined at all, because the only place defining it was the config loader code. Now it's defined, but will still fail out since it's not a valid CA certificate file.

* [pep8] Final pep8 run

Everything should now conform to pep8 and pass flake8.

* [tests] Improve trigger coverage

* [reddit] Prevent specific commands in PM

Resolves sopel-irc#789

* Documentation for target types

* docs/core: Shift around functions to make autodoccing easier

* docs: Start on a major cleanup of documentation

* Make _ssl_recv always return bytes

_ssl_recv returned empty strings instead of the empty bytes object if
the socket was closed or upon ENOENT.

This led to exceptions when running sopel with python3 because
asynchat.handle_read expects bytes objects.

This commit fixes sopel-irc#937.

* docs: Majorly overhaul organization and format

* Release 6.2.0

* Release 6.2.1

* dice: Allow comma delimiter

Closes sopel-irc#998

* adminchannel: Remove totally useless commands

Also add error messages to the somewhat useful ones

* Make it a bit harder to run into the LC_ALL thing

This behavior is stupid. Respecting LC_ALL, or anything else for that
matter, over the encoding fucking noted in the fucking file is a bad
decision, and someone should feel bad. I don't know why it makes things
break in the specific and bizarre way it does, but it does, and there's
no possible good reason for it.

Closes sopel-irc#984. Fuck.

* Escape nick before replacing it in regex

Resolves sopel-irc#1004
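
To see why escaping matters, here is a small sketch. The pattern and the `nick_pattern` helper are hypothetical and far simpler than the bot's real rule handling; the point is only that IRC nicks may contain regex metacharacters such as `[` and `]`:

```python
import re


def nick_pattern(nick):
    """Build a regex matching messages addressed to nick."""
    # re.escape() makes characters like [ ] ^ | match literally.
    return re.compile(r'^' + re.escape(nick) + r'[,:]\s+(.*)')


line = 'bot[away]: hello'

# Unescaped, "[away]" is read as a character class, so nothing matches.
unescaped = re.compile(r'^' + 'bot[away]' + r'[,:]\s+(.*)')
print(unescaped.match(line))

print(nick_pattern('bot[away]').match(line).group(1))
```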

* [weather] Fix YQL woeid lookup

Handles an edge case that neither of the PRs handled

Fixes sopel-irc#1006
Closes sopel-irc#1007, sopel-irc#1012

* CONTRIBUTING: Let's drop the [brackets] thing and do what everyone else does

No point in being different from any other FOSS project out there.

* contrib/rpm: willie->sopel

* CONTRIBUTING: Update coding/future import guidelines

`# coding=utf-8` is now the standard in Sopel & supports windows. The future import now conforms with the flake8 future imports (also conforms with @embolalia's formatting passes a bit back)

* web: make web.py into a requests compatibility layer

Since web is deprecated and everyone should switch to requests,
the first step is to make web.py a requests compatibility layer.

When web.py was new, requests was not ripe enough to use in Sopel.

But now it's time to switch to requests like the rest of the python
ecosystem. web.py is no more.

* find_updates: switch from web.py to requests

* web: Fix typo

This is why we have flake8 ;)

* url: port to requests

Now that web.py is deprecated, we can port url.py to requests.

Originally from pull request sopel-irc#988, committed here with minor bugfixes
and modified commit message.

* translate: port to requests

* movie: port to requests

* xkcd: port to requests

* Track the channel topic

* Improve locale stupidity checking

Thanks to @elad661's comment on b73fc6a

* url: handle capitalized URLs

Trigger rules are case-insensitive regexes, so the auto title responder
will be triggered even for capitalized URLs such as "Http://google.com"
(which can happen, for example, when a mobile device attempts to
auto-capitalize the beginning of a sentence).  Match URLs case
insensitively for title lookup purposes and add error handling in case
no URLs could be extracted from the match.
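
As a sketch of the matching side, with a deliberately simplified pattern (`URL_RE` and `find_urls` are hypothetical names, not the module's own):

```python
import re

# re.IGNORECASE lets "Http://" and "HTTPS://" match as well.
URL_RE = re.compile(r'https?://\S+', re.IGNORECASE)


def find_urls(text):
    """Return all URLs found in text, or an empty list if there are none."""
    return URL_RE.findall(text)


urls = find_urls('Http://google.com is what my phone typed')
if urls:
    print(urls[0])
else:
    # Nothing could be extracted, so there is no title to look up.
    print('no URL found')
```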

* core: Tweak rate limiting to be more effective

This doesn't solve the issue, but it should make it slightly less
critical. sopel-irc#952

* reddit: fetch posts by submission_id

Previously, the reddit module fetched posts by the full URL of the post.
This led to RedirectExceptions in some cases, for example when someone
links a naked reddit.com URL instead of www.reddit.com.  Instead, match
only the post ID and pass it to get_submission.

* Release 6.3.0

* Don't warn about non-UTF8 locales when running on Python 2

Python 2 doesn't change string behavior according to the locale env,
that's a py3 specific weirdness.

Also, reword the error message to better explain the issue to the user.

* Fix apparent typo in host_blocks initialization

* Added missing import to xkcd.py

* core: Fix print in handle_error when reaching the exception limit

Fixes issue sopel-irc#1025

* trigger: Fix target for QUIT events

This was fun to debug! Basically, Sopel encountered an exception
when removing unknown users when they QUIT. While this shouldn't
happen, it should still be handled gracefully.

Since it was an exception, Sopel's response was to try and send
the exception line to the channel (sender) the message came from, but
QUIT events don't come from channels (or users by PRIVMSG)!

Since QUIT was not special-cased, the naive assumption that
the first argument is the "sender" was used, and when Sopel
tried to send the exception line to the "sender", and the sender
had a space in it, this would lead to spam if a user exists with a nick
that is identical to the first word in the QUIT message. Ouch.

The fix special-cases QUIT in pretrigger to never have a "sender".
I also added a test to make sure we parse QUIT correctly.

This solves issue sopel-irc#1026.
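
The special case can be pictured like this. It is a hypothetical, stripped-down sketch (the `sender_for` helper and its arguments are made up here), not the actual pretrigger code:

```python
def sender_for(event, args):
    """Pick the reply target for an IRC event."""
    # QUIT is not sent to a channel or to a user, so it has no sender.
    if event == 'QUIT':
        return None
    # Naive default: treat the first argument as the sender.
    return args[0] if args else None


# For QUIT the first argument is the quit message, not a target.
print(sender_for('QUIT', ['Ping timeout: 240 seconds']))
print(sender_for('PRIVMSG', ['#sopel', 'hello']))
```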

* core: Never try to send an exception line when sender is None

Just in case.

* Make sure we're working with UTF8 string

Depending on the URL, response.iter_content() in Python 2/3 will return either:
- `type 'unicode'`/`class 'str'` (for "plain" HTML)
- `type 'str'`/`class 'bytes'` (for binary streams, like file URLs)

To distinguish between the two situations we're checking if we got string or
bytes, and proceed accordingly.

This also fixes sopel-irc#1021
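
In outline, the check looks something like the following. It is a sketch assuming Python 3 and UTF-8 content; `to_text` is a hypothetical helper name:

```python
def to_text(chunk, encoding='utf-8'):
    """Return chunk as text, decoding it first if it is bytes."""
    if isinstance(chunk, bytes):
        # Binary chunk: decode it, dropping bytes that don't decode.
        return chunk.decode(encoding, errors='ignore')
    # Already text: pass it through unchanged.
    return chunk


print(to_text(b'caf\xc3\xa9'))
print(to_text('café'))
```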

* [doc] unify grammatical number of `@commands` example

It seems there is no `@command` decorator.

At any rate, examples for `@commands` should use the same (and correct) grammatical number.

* Release 6.3.1

* FIX: Private BZ's - AttributeError: 'NoneType'

* [calc] Remove .wa, as API now requires a key

Will be moved to an external module that supports the new API

* search: Remove ad URL results

DDG changed their HTML output slightly and that threw us off, this *should* fix the r.search.yahoo.com URLs that .g was returning.

* Fix issue sopel-irc#1048

* fix .set command for non filename attributes

* Fix loading/reloading modules that share the name of the bot owner

* Typo correction

deamon -> daemon

Squashed into a single commit.

* Fix config loading in some edge cases

Fixes sopel-irc#999

Happens where try/except wouldn't catch NoOptionError, usually when running tests in specific environments.

* add groupdict function to triggers (sopel-irc#1061)

* Add IRCv3 extended-join tests

* Add regular join test

* Replace e.message with str(e), e.message has been deprecated since python 2.6

* Fix nickname examples

help_prefix shouldn't replace the first character if it's a nickname example.

* Fix syntax error

* weather: fix location yql query

Resolves sopel-irc#1050 and sopel-irc#1029

* Implement proper extended-join support

* search: tweak ad result blocking

A slight regex change to avoid Yahoo ad results from DuckDuckGo if it ends up using the HTML search.

* irc: toggle error replies (sopel-irc#1071)

Adds config option to toggle Sopel replying directly to the error source.

* irc: always send exceptions to logger

We don't really need to check if `trigger.sender` is `None` when we're sending to the logger -- as long as `trigger` is defined, we'll be fine.

This just ensures that the `logging_channel` will always get the exception messages. Also pre-formats the message using format because it's more clear what's going on this way.

* Create suppress-warnings.py

Can be dropped into ~/.ipython/profile_default/startup/ to suppress the DeprecationWarnings you get when starting Sopel with IPython enabled.

* run_script: if argv is specified, use it

* [announce] Confirm when all announces have been sent (sopel-irc#1044)

* Add global and channel rate limits (sopel-irc#1065)

* Add global and channel rate limits

* Default user rate and compatibility with jenni modules

* Fix critical keyerror bug in rate limiting

* Simplify syntax for @rate() decorator and update docs

* Don't reset function timer during cooldown

* fix channel time diff variable

* fix indentation in bot.py

* weather: catch empty forecast results (sopel-irc#1077)

e.g. when the user enters a continent for the location.

* irc: treat error in connect as a disconnect (sopel-irc#845)

* irc: test suite enhancement

Comes with some tweaks to support tests
Daemonizes the ping and timeout threads (they should have been in the first place)

* coretasks: prevent KeyError when untracked user leaves

Fixes sopel-irc#1005

* web: fix header bleed (sopel-irc#1092)

Resolves sopel-irc#1091

* seen: be a smart-ass if people ask the bot about itself (sopel-irc#1086)

* module: ignore privilege requirement in privmsg (sopel-irc#1093)

Resolves sopel-irc#1087

* run_script: fix PID file checking logic when the file is empty

This fixes issue sopel-irc#1075

I don't know why the elif explicitly negated the previous condition; it's
obviously not needed, because elif already implies the previous
condition is False.

Also, whoever added the parentheses there messed up the logic even further.
Before they were there, it worked okay, even if the condition was a bit
more verbose than logically needed. Well, that's what you get when you
blindly try to make code conform to PEP8 without actually reading it.

* unicode_info: fallback if input is None (sopel-irc#958)

Resolves sopel-irc#957

* db: raise ValueError in unalias_nick to match documentation (sopel-irc#1102)

Documentation says that a ValueError should be raised if there is not at least one other nick in the group.

Resolves sopel-irc#1101

* Update .gitignore (sopel-irc#1110)

Renamed willie references to sopel
Added .DS_Store ignore

* coretasks: tweak topic tracking (sopel-irc#1111)

Support a different implementation of topic updates; RPL_TOPIC appears to be sent only to the user who actually updated the topic.

Resolves sopel-irc#1107

* meetbot/url: fix SSLError

The core.verify_ssl was not passed to url.find_title(), resulting in SSL errors on sites with invalid certs when `verify_ssl = False`

Slight refactor of @psachin's original code for backwards compatibility. Resolves sopel-irc#1113

* coretasks: add support for authentication on Quakenet (sopel-irc#1122)

Added the necessary lines for authenticating Sopel with Q. The implementation is almost exactly like AuthServ's. Added Q to core_section along with the other authentication methods, since it is now supported.

* coretasks: remove .lower() on auth_method

auth_method may be None if it's unset, forgot about that case when merging.

Resolves sopel-irc#1124

* setup: tweak requirements

Remove unsupported requires statement in setup.py
Pin requests dependency to 2.10.0 as 2.11.0 introduced a breaking change against the url.py module

* sopel/trigger.py: fix intent_regex

* url: make find_title more robust

Previously, each 512-byte chunk was decoded on its own, so an incomplete UTF-8 sequence at a chunk boundary was prone to decoding mishaps. Now we decode all of the content at once, ignoring errors.

The old problem appears reliably for pages with many high codepoints:

~~~
<user> http://www3.nhk.or.jp/news/easy/k10010665021000/k10010665021000.html
<bot-old> [ NEWS WEB EASY|������������人���� ] - www3.nhk.or.jp
~~~

* [reddit] Change NSFW tag to SPOILERS for some subs

Hard-coded rather than configured, since in theory the same list should
apply to everyone, and we should merge in new ones. That and effort.

* setup: Be more flexible about requests version

* Release 6.4.0

* Notify if Bugzilla is private (sopel-irc#1115)

Although the primary error no longer exists, the bot shows nothing if
the Bugzilla bug has an invalid alias or an invalid ID, or if the bot has
no permission to access the bug. The logs should show warnings such as:

  WARNING:sopel.modules.bugzilla:Bugzilla error: NotFound <- (Invalid ID)
  WARNING:sopel.modules.bugzilla:Bugzilla error: NotPermitted <- (No permission)

This patch should notify about those errors.

Closes-Bug: sopel-irc#1112

Signed-off-by: Sachin Patil <[email protected]>

* Lint imports (sopel-irc#1085)

After realizing I'd left a dead import in calc.py after removing the .wa command,
I decided to go through and clean up any other imports that didn't appear to be in
use any more.

* Fixed missing `verify_ssl` param

- `verify_ssl` param was missing in few function calls

Closes-Bug: sopel-irc#1118

Signed-off-by: Sachin Patil <[email protected]>

* Add a decorator for url handling

Closes sopel-irc#761. Also add xkcd url handling as a demo.

* Add docstring for url decorator

* Create a gist with the command list

Closes sopel-irc#1105
Closes sopel-irc#1080

* Cache help gist location

* Use custom user agent for title requests

* Add Travis badge

* Increase timeout for DB locked error

This doesn't fix sopel-irc#736, but should at least make it less common

* Fix CI

* Release 6.5.0

* [weather] Use help_prefix in hint text when no location given

* Add a pronouns module

If witch-house/pronoun.is#40 gets merged, it's
probably worth porting to use that, since there are a *lot* of pronoun
sets.

Yes, this should probably support other languages. Sopel's i18n is
horrible and I know it.

* Fix asking for another user's pronouns

* Be a bit less snarky when asked for the bot's pronouns

But only a little

* [etymology] unescape all known HTML entities

Replace bespoke implementation of unescape() with stdlib tools; fix sopel-irc#1153.

* Fixes for pronouns.py

Fixes setpronouns error on lack of trigger.group(2), fixes autocomplete of nicks with a space so that it's stripped out automatically, fixes that it will say the wrong username if you request someone else's pronouns.

* Fixes ConnectionError

url.find_title() throws ConnectionError when the hostname or IP address is
not reachable, and therefore fails to read the title.

Sample error
```
15:11:05 psachin:     https://10.65.177.15
15:11:09 BB-8:        requests.exceptions.ConnectionError: HTTPSConnectionPool(host='10.65.177.15', port=443): Max retries exceeded with url: / (Caused by NewConnectionError('<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7feb41b6a2e8>: Failed to establish a new connection: [Errno 113] No route to host',)) (file "/home/tss/virtualenvs/sopel/lib64/python3.5/site-packages/requests/adapters.py", line 487, in send)
```

* Update web.py

fix blank User-Agent, if a custom user-agent is set for web.get()

* Use a common user-agent to get the proper results from DDG

* Update search.py

Hmm maybe single quotes would be better.

* fix some missed stuff

* Missed one more header copy.  Hopefully last one.

* Fix typo

* Remove duplicate item in triggerable tuple check

* Exclude File links from regex matching

Fix sopel-irc#1182

* Added default value to numbered_result

Added missing default value of "True" for "verify_ssl" parameter on "number_result".

* IP example

Fixed broken IP module example

* Fix API urls for Bank of Canada and BitcoinAverage

* Upper/lowercase shouldn't matter for tell module

* Release 6.5.1

* weather: update from deprecated sopel.web to requests

* safety module - catch exception on urllib/parse

* Fix reddit module

* Actually fix reddit

* [ip] Fix example/test (Google Inc. => Google LLC)

Google changed to an LLC, and updated its AS information, which broke
the test assertion.

Changes cherry-picked from sopel-irc#1250 and reworded.

* Update ignored files for tests

Ignore movie.py module because it requires an API key (and will probably
be moved out of core anyway).

Fix ignores for entry script and ipython module (which were still using
the old "willie" name and therefore weren't ignored). This also allows
removing the command-line ignore from the Travis build script.