Mikhail Korobov

Login: kmike

Company: @ScrapingHub

Location: Russia, Ekaterinburg

Blog: http://kmike.ru/pages/about/

Member of

  1. conda-forge
  2. Natural Language Toolkit
  3. OpenCorpora
  4. pytries
  5. Scrapinghub
  6. Scrapy Plugins
  7. Scrapy project

Repositories

backbone
Give your JS App some Backbone with Models, Views, Collections, and Events
behavior
Auto-instantiates widgets/classes based on parsed, declarative HTML.
brukva
Asynchronous Redis client that works within Tornado IO loop.
casperjs
Navigation scripting & testing utility for PhantomJS and SlimerJS
celery
Distributed Task Queue
c-hat-trie
An efficient trie implementation.
clientcide
The Clientcide Javascript Libraries
cookiecutter
A command-line utility that creates projects from cookiecutters (project templates). E.g. Python package projects, jQuery plugin projects.
crfsuite
CRFsuite: a fast implementation of Conditional Random Fields (CRFs)
cssselect
CSS Selectors for Python
cython
A Python to C compiler
dialog2017
(no description)
dirbot
Scrapy project to scrape public web directories (educational)
django
The Web framework for perfectionists with deadlines.
django-admin-decorators
Extra decorators for django admin
django-admin-user-stats
This app provides django-admin-tools dashboard modules with user registration stats/charts.
django-batch-select
batch select many-to-many and one-to-many fields (to help avoid n+1 query problem)
django-celery
Celery integration for Django
django-colorful
extension to the Django web framework that provides database and form color fields
django-coverage
Yet another django test coverage app with nice custom html reports. The official git mirror.
django-debug-toolbar
A configurable set of panels that display various debug information about the current request/response.
django-eml-email-backend
Django email backend that saves emails to .eml files. Such files can be opened directly in Outlook or Mail.app.
Django-facebook
Facebook open graph api implementation using the Django web framework in python
django-flatblocks
django-chunks + headerfield + variable chunknames + "inclusion tag" == django-flatblocks
django-generic-images
This app provides image model (with useful managers, fields, utility methods and advanced admin image uploader) that can be attached to any other Django model using generic relations.
django-mailru-money
Django app for money.mail.ru. It was not tested in production!
django-mootools-behavior
Utilities for https://github.com/anutron/behavior integration with django.
django-pony
A pony for your django project.
django-profiling-dashboard
Dashboard with various profiling tools suitable for live servers
django-query-exchange
Django application for handling GET query params for url creation
django-registration-facebook-backend
A Facebook Connect backend for use with django-registration
django-robokassa
This fork is not supported. Application for integrating the ROBOKASSA payment system into Django projects. License: MIT.
django-salmonella
A raw_id_fields widget replacement that handles display of an object's string value on change and can be overridden via a template.
django-sentry
Previously django-db-log, provides real-time logging for Django exceptions
django-seo
Provides a set of tools for managing Search Engine Optimisation (SEO) for Django sites.
django-tastypie
Creating delicious APIs for Django apps since 2010. Beta-quality. v0.9.8
django-timelog
Performance logging middleware and analysis tools for Django
django-vkontakte-iframe
Django app for developing vk.com (aka vkontakte.ru, the largest Russian social network) iframe applications. Handles user authentication and registration. Official git mirror.
django-whatever
Unobtrusive test models creation for django
doctest2
Improvements to the doctest standard library
Dolt
A dumb little wrapper around RESTful interfaces
easy-thumbnails
Easy thumbnails for Django
elasticsearch
Open Source, Distributed, RESTful Search Engine
elasticsearch-py
Official Python low-level client for Elasticsearch.
fabric
Simple, Pythonic remote execution and deployment.
fabric-taskset
Expose methods as Fabric tasks
fabtest
Test Fabric scripts on VirtualBox VMs
factory_boy
A test fixtures replacement for Python based on thoughtbot's factory_girl for Ruby.
feincms
A Django-based CMS with a focus on extensibility and concise code
FinnPos
CRF-based Morphological Tagging and Lemmatization
funny-codes
Generate random strings of a given pattern
GearsUploader
GearsUploader is a set of Google Gears-based MooTools classes that unobtrusively replace file inputs and show an animated progress bar. They feature client-side image resizing, multiple file uploads, complete emulation of traditional form submissions (no server-side changes are required) and Django formset submissions.
gensim
Topic Modelling for Humans
graphite-web
(no description)
greatape
Client library for the MailChimp API v1.2.
harviewer
HAR Viewer is a web application that allows visualizing HTTP Archive logs (HARs)
hdbscan
A high performance implementation of HDBSCAN clustering.
high_performance_python
Code for the book "High Performance Python" by Micha Gorelick and Ian Ozsvald with O'Reilly
imobis
Python interface to http://sms-manager.ru/ (aka http://www.imobis.ru/ ).
ipython
Official repository for IPython itself. Other repos in the IPython organization contain things like the website, documentation builds, etc.
jogging
Easier Django logging!
jsonview
A Firefox extension that helps you view JSON documents in the browser.
jupyter_kernel_test
A tool for testing Jupyter kernels
keras
Deep Learning library for Python. Convnets, recurrent neural networks, and more. Runs on Theano and TensorFlow.
kivy
Open source software library for creating NUI applications, running on Windows, Linux, MacOSX, Android
languagetool
Style and Grammar Checker for 25+ Languages
Lasagne
Lightweight library to build and train neural networks in Theano
LightGBM
A fast, distributed, high performance gradient boosting (GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks. It is under the umbrella of the DMTK(http://github.com/microsoft/dmtk) project of Microsoft.
lightning
Large-scale linear classification, regression and ranking in Python
LT2OpenCorpora
Python script to convert a Ukrainian morphological dictionary to OpenCorpora format. The script runs well under PyPy and also collects some stats/insights/anomalies in the dicts. Use at your own risk.
lupa
Lua in Python
matplotlib
matplotlib: plotting with Python
memex-cdr
This repository hosts code and schema information related to the Memex Crawl Data Repository (CDR)
memory_profiler
Monitor Memory usage of Python code
microcorpus
(description unreadable: the original Russian text was lost to encoding damage)
mitmproxy
An interactive SSL-capable intercepting HTTP proxy for penetration testers and software developers
mootools-meio-mask
A mootools plugin for creating masked input texts
mootools-more
MooTools Plugins and Enhancements Repository
morphine
[experiment] CRF-based disambiguation engine for pymorphy2
nltk
For patches to NLTK
nltk-coverage-data
temporary repository with nltk test coverage data
nltk_data
NLTK Data
nodejs
Node.js Dockerfile for trusted automated Docker builds.
opencorpora-tools
Python interface to http://opencorpora.org/
pelican
Static blog generator in python, using ReST syntax
Pillow
Pillow is the "friendly" PIL fork
pip
pip installs packages. Python packages. An easy_install replacement
port-for
`port-for` is a command-line utility and a python library that helps with local TCP ports management. `port-for ` script finds an unused port and associates it with ``. Subsequent calls will return the same port number.
portia
Visual scraping for Scrapy
psdparse
A python utility to parse various structures inside an Adobe Photoshop(TM) PSD format file.
pycon-speakers
(no description)
pyCRFsuite
(no description)
py-leveldb
leveldb bindings for python
pymorphy
[UNSUPPORTED] - please use https://github.com/kmike/pymorphy2. Russian and English morphology analyser (POS tagger + inflection engine) written in python. It is based on dictionaries and research from http://aot.ru. Docs (Russian): http://pymorphy.rtfd.org/
pymorphy2
Morphological analyzer / inflection engine for Russian and Ukrainian languages.
pymorphy2-dicts
Scripts for updating pymorphy2 dictionaries
pyre2
Python wrapper for RE2
pystatsd
A Python client for statsd
python-faker
Generate placeholder data. Port of Ruby port of Perl module.
python-wapiti
Python bindings for libwapiti
pytils
Russian-specific string utils
pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
qpr-summer-2016-eval
(no description)
qtreactor
Twisted Qt4 Integration
refluxjs
A simple library for uni-directional dataflow application architecture inspired by ReactJS Flux
ruscorpora-tools
Python interface to a free corpus subset available at http://ruscorpora.ru
russian-tagsets
Russian morphological tagset converters library.
sailthru-python-client
Python client for Sailthru
scikit-learn
scikit-learn: machine learning in Python
scrapely
A pure-python HTML screen-scraping library
scrapy
Scrapy, a fast high-level screen scraping and web crawling framework for Python.
scrapyd
A service daemon to run Scrapy spiders
scrapy-djangoitem
Scrapy extension to write scraped items using Django models
scrapylib
Collection of Scrapy extensions, middlewares, pipelines & helper functions
segtok
A rule-based sentence segmenter and a word tokenizer using orthographic features.
seqlearn
Sequence learning toolkit for Python
shub
Scrapinghub Command Line Client
slybot
Slybot web crawler, replaced by Portia
Socket.IO
Sockets for the rest of us
spaCy
Industrial-strength Natural Language Processing (NLP) with Python and Cython
sphinx-autobuild
Watch a Sphinx directory and rebuild the documentation when a change is detected. Also includes a livereload enabled web server.
splash
Javascript rendering service
StarCluster
StarCluster is an open source cluster-computing toolkit for Amazon's Elastic Compute Cloud (EC2).
stupid-python-tricks
Stupid Python tricks.
templated-emails
A simple app (that works similar to django-notification) that allows you to send emails by specifying a short.txt (subject), email.txt (plain text), and email.html (html email, optional) in a folder. When you send the email you only have to specify the folder and the context.
text-unidecode
The most basic Text::Unidecode port (licensed under Artistic License)
Theano
Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It can use GPUs and perform efficient symbolic differentiation.
tornadio
Socket.io server implementation on top of Tornado framework
tornado
Tornado is an open source version of the scalable, non-blocking web server and tools that power FriendFeed.
tornado-es
A tornado-powered python library that provides asynchronous access to elasticsearch
tornado-slacker
(experiment!) This package provides an easy API for moving the work out of the tornado process / event loop.
tqdm
Add a progress meter to your loops in a second
UnbalancedDataset
Python module to perform under sampling and over sampling with various techniques.
vision
Datasets, Transforms and Models specific to Computer Vision
vkontakte
vkontakte.ru python API wrapper
vowpal_wabbit
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive learning.
w3lib
Python library of web-related functions
Wapiti
A simple and fast discriminative sequence labeling toolkit ( http://wapiti.limsi.fr )
WebAnnotator
WebAnnotator is a tool for annotating Web pages. WebAnnotator is implemented as a Firefox extension (https://addons.mozilla.org/en-US/firefox/addon/webannotator/), allowing annotation of both offline and online pages. The HTML rendering is fully preserved and all annotations consist of new HTML spans with specific styles. WebAnnotator provides an easy and general-purpose framework and is made available under the CeCILL free license (close to GNU GPL; see the license text), so that use and further contributions are made simple. All parts of an HTML document can be annotated: text, images, videos, tables, menus, etc. The annotations are created by simply selecting a part of the document and clicking on the relevant type and subtypes. The annotated elements are then highlighted in a specific color. Annotation schemas can be defined by the user by creating a simple DTD representing the types and subtypes that must be highlighted. Finally, annotations can be saved (HTML with highlighted parts of documents) or exported (in a machine-readable format).
webassets
Asset management for Python web development.
webstruct
Learning the structure of the web
webtest
Wraps any WSGI application and makes it easy to send test requests to that application, without starting up an HTTP server.
yandex-maps
[Unsupported; contact kmike84@gmail.com.] Application for working with the Yandex Maps API.

Commits To

Repository           Most Recent Commit      # Commits
rkern/line_profiler  2014-04-21 17:57:52.0   4


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.