Roujack/mathAI

A photo-based math solver: given an image containing an arithmetic problem, it outputs the recognized mathematical expression and the computed result. This is a mathematical expression recognition project.

mathAI

A photo-based math solver: given an image containing an arithmetic problem, it outputs the recognized mathematical expression and the computed result.
Please read the system documentation to run the program. Note that this is a partially open-sourced project: the uploaded version only handles simple one-dimensional arithmetic expressions (addition, subtraction, multiplication, and division); to recognize more complex expressions, see the literature on mathematical formula recognition. The parts worth reusing are the character-recognition code and the overall processing framework.

The whole program is implemented in Python. The processing pipeline consists of image preprocessing, character recognition, mathematical formula recognition, semantic interpretation of the formula, and output of the result.

The program uses OpenCV to preprocess the input image, crop out the individual characters, and normalize each one into a fixed-size matrix. I implemented a LeNet-5 convolutional neural network in TensorFlow to recognize the math characters, trained on the CROHME dataset. Formula recognition then organizes the recognized, isolated characters into a mathematical expression the computer can understand (here, a plain-text, solvable arithmetic problem). Roughly, this is done with techniques from compiler theory: operator-precedence parsing and recursive descent. The value of the expression is then computed following the value-passing idea of attribute grammars. Finally, Python's matplotlib library is used to display the computation process and the answer.
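As a rough illustration of the recursive-descent stage described above, here is a minimal, hypothetical evaluator for such one-dimensional expressions (a sketch, not the project's actual code):

```python
# Minimal recursive-descent evaluator for one-dimensional + - * / expressions
# with parentheses (hypothetical simplification of the approach described above).

def tokenize(expr):
    tokens, num = [], ''
    for ch in expr.replace(' ', ''):
        if ch.isdigit() or ch == '.':
            num += ch
        else:
            if num:
                tokens.append(float(num))
                num = ''
            tokens.append(ch)
    if num:
        tokens.append(float(num))
    return tokens

def parse(tokens):
    # expression := term (('+'|'-') term)*
    def expression(i):
        value, i = term(i)
        while i < len(tokens) and tokens[i] in ('+', '-'):
            op = tokens[i]
            rhs, i = term(i + 1)
            value = value + rhs if op == '+' else value - rhs
        return value, i

    # term := factor (('*'|'/') factor)*
    def term(i):
        value, i = factor(i)
        while i < len(tokens) and tokens[i] in ('*', '/'):
            op = tokens[i]
            rhs, i = factor(i + 1)
            value = value * rhs if op == '*' else value / rhs
        return value, i

    # factor := number | '(' expression ')'
    def factor(i):
        if tokens[i] == '(':
            value, i = expression(i + 1)
            return value, i + 1  # skip the closing ')'
        return tokens[i], i + 1

    return expression(0)[0]

print(parse(tokenize('3+4*2-(1+1)')))  # 9.0
```

The precedence levels come from the grammar itself: `term` binds tighter than `expression`, so multiplication and division are evaluated before addition and subtraction.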

Strengths: this is a complete algorithmic framework for solving math problems from photos, and it can handle a wide variety of arithmetic problems; I have not yet seen a comparable public implementation. With OCR as mature as it is today, character recognition itself is no longer the challenging part.
Weaknesses: the spatial relationships between characters are judged only by hand-crafted heuristic rules, the image preprocessing is not robust, and the structural recognition of formulas is imperfect (a two-dimensional grammar could be considered). There is still plenty of room to improve the system.

This began as a very ambitious project, because it tried to solve every math problem we encounter. More generally, I tried to build a deduction system: given some rules (axioms) and some premises, it should derive conclusions. But I found this to be a very hard problem.
What I have learned recently is that, on one hand, finding loop invariants cannot be automated, and on the other hand, the time and space complexity of deduction is prohibitive (see the implementation of the Prolog language).

What is intelligence? Is it today's machine learning? I don't think so. SVM is an algorithm that maximizes a functional margin; a neural network is a search for a local optimum of a loss function. These techniques are optimization at the mathematical level.
Or is it some kind of reasoning ability? I don't know. If, given deduction rules, symbols with real-world meaning, and some axioms, a system could derive conclusions that are meaningful in the real world, would that count as intelligence? I think it would, though it is very hard to achieve. Wu Wenjun reportedly proved geometric theorems by an automated method (Wu's method).

What exactly are our consciousness and our thoughts? How is information actually represented in the brain? Why do we have the ability to generalize what we learn? Cognitive neuroscience is searching for answers, and so is computer science.
I hope that consciousness can come to understand matter, in a process that grows ever closer to completeness. That is the point of research. I am willing to keep searching for the truth.

oarriaga/face_classification

Real-time face detection and emotion/gender classification using fer2013/imdb datasets with a keras CNN model and openCV.


face_classification

Face classification and detection.

Real-time face detection and emotion/gender classification using fer2013/IMDB datasets with a keras CNN model and openCV.

  • IMDB gender classification test accuracy: 96%.
  • fer2013 emotion classification test accuracy: 66%.

For more information, please consult the publication.

Emotion/gender examples:

Guided back-prop

Real-time demo:

B-IT-BOTS robotics team 😃

Instructions

Run real-time emotion demo:

python3 video_emotion_color_demo.py

Run real-time guided back-prop demo:

python3 image_gradcam_demo.py

Make inference on single images:

python3 image_emotion_gender_demo.py <image_path>

e.g.

python3 image_emotion_gender_demo.py ../images/test_image.jpg

Running with Docker

In a few steps you can get your own face classification and detection running. Follow the commands below:

  • docker pull ekholabs/face-classifier
  • docker run -d -p 8084:8084 --name=face-classifier ekholabs/face-classifier
  • curl -v -F image=@[path_to_image] http://localhost:8084/classifyImage > image.png
To train previous/new models for emotion classification:
  • Download the fer2013.tar.gz file from here

  • Move the downloaded file to the datasets directory inside this repository.

  • Untar the file:

tar -xzf fer2013.tar.gz

  • Run the train_emotion_classifier.py file

python3 train_emotion_classifier.py

To train previous/new models for gender classification:
  • Download the imdb_crop.tar file from here (it's the 7 GB button titled "Download faces only").

  • Move the downloaded file to the datasets directory inside this repository.

  • Untar the file:

tar -xvf imdb_crop.tar

  • Run the train_gender_classifier.py file

python3 train_gender_classifier.py

Fingers-Detection-using-OpenCV-and-Python

A simple finger detection (or gesture recognition) tool using OpenCV and Python with background subtraction.

Fingers-Detection-using-OpenCV-and-Python

For people using Python 2 and OpenCV 2, please check out the lzane:py2_opencv2 branch.

For people using OpenCV 4, please change line 96 in new.py to contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) to match the OpenCV API change.

Environment

  • OS: macOS El Capitan
  • Platform: Python 3
  • Libraries:
    • OpenCV 3
    • appscript

Demo Videos

How to run it?

  • Run new.py in Python
  • Press 'b' to capture the background model (remember to move your hand out of the blue rectangle first)
  • Press 'r' to reset the background model
  • Press 'ESC' to exit

Process

Capture original image

Capture video from camera and pick up a frame.

Alt text

Capture background model & Background subtraction

Use a background subtraction method, the Gaussian Mixture-based Background/Foreground Segmentation Algorithm, to remove the background.

For more information about the method, see Zivkovic (2004).

Here I use OpenCV's built-in BackgroundSubtractorMOG2 to subtract the background.

bgModel = cv2.createBackgroundSubtractorMOG2(0, bgSubThreshold)

Build a background subtractor model

fgmask = bgModel.apply(frame)

Apply the model to a frame

res = cv2.bitwise_and(frame, frame, mask=fgmask)

Get the foreground (hand) image.

Alt text

Gaussian blur & Threshold

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

First, convert the image to grayscale.

blur = cv2.GaussianBlur(gray, (blurValue, blurValue), 0)

Gaussian blurring creates smooth transitions between colors and reduces edge content.

Alt text

ret, thresh = cv2.threshold(blur, threshold, 255, cv2.THRESH_BINARY)

We use thresholding to create binary images from grayscale images.

Alt text

Contour & Hull & Convexity

We now need to find the hand contour in the binary image we created and then detect the fingers (in other words, recognize the gesture).

contours, hierarchy = cv2.findContours(thresh1, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

This function finds all the contours in the binary image. We then pick the biggest contour by area, since we can assume the hand is the largest object in the scene.

After picking out the hand, we can compute its convex hull and detect the convexity defects by calling:

hull = cv2.convexHull(res)
defects = cv2.convexityDefects(res, hull)

Alt text

Now we have the number of fingers. How to use this information? It’s based on your imagination…

I added a keyboard-simulation package, appscript, as an interface to control Chrome's dinosaur game.

Alt text


References & Tutorials

  1. OpenCV documentation:
    http://docs.opencv.org/2.4.13/
  2. Opencv python hand gesture recognition:
    http://creat-tabu.blogspot.com/2013/08/opencv-python-hand-gesture-recognition.html
  3. Mahaveerverma’s hand gesture recognition project:
    hand-gesture-recognition-opencv

Kr1s77/flask-video-streaming-recorder

An OpenCV + Flask home surveillance system.

Raspberry Pi flask+opencv Surveillance System




"Did you know all your doors were locked?" - Riddick (The Chronicles of Riddick)



Created by CriseLYJ


Installing

🐍 First, install Python 3.x on your Raspberry Pi:

$ sudo apt-get update
$ sudo apt-get upgrade

  • Install the Python build dependencies:

$ sudo apt-get install build-essential libsqlite3-dev sqlite3 bzip2 libbz2-dev

  • Download the Python 3.6 source code and extract it:
  $ wget https://www.python.org/ftp/python/3.6.1/Python-3.6.1.tgz
  $ tar zxvf Python-3.6.1.tgz
  • Compile and install:
  $ cd Python-3.6.1
  $ sudo ./configure
  $ sudo make
  $ sudo make install
  • Verify the installation:

$ ls -al /usr/local/bin/python*

Next, install the required modules.

  • Install Flask:

$ pip3 install flask==0.10.1

  • Install OpenCV:

$ pip3 install opencv_python

Running the tests

  • Download all the files
  • Run main.py:

$ python3 main.py -p 0.0.0.0
You can also use Gunicorn as your multithreaded server.

  • 2019.2.21 update

  • Added a simple login interface; no database required

  • Test account

Username: admin
Password: admin
  • 2019.3.4 update

  • Added multi-threading and recording downloads

  • Multi-device access is supported; logout/login works correctly

  • 2019.3.14 update

  • The directory structure now looks like this

  • The code has been refactored and optimized, which is why there are more directories

  • Added a beautiful login interface
    Alt text

  • Optimized the home page

Alt text

  • Added video recording and download capabilities
  • High performance: a yield-based generator plus multi-threading, silky smooth!

Author

  • Crise LYJ

Acknowledgments

  • Thanks to everyone!

  • Have a good time!

Alt text

nuno-faria/tiler

👷 Build images with images

nuno-faria/tiler


👷 Build images with images.

About

Tiler is a tool to create an image using all kinds of other smaller images (tiles). It is different from other mosaic tools since it can adapt to tiles with multiple shapes and sizes (i.e. not limited to squares).

An image can be built out of circles, lines, waves, cross stitches, LEGO bricks, Minecraft blocks, paper clips, letters, ... The possibilities are endless!

Installation

  • Clone the repo: git clone https://github.com/nuno-faria/tiler.git;
  • Install Python 3;
  • Install pip (optional, to install the dependencies);
  • Install dependencies: pip install -r requirements.txt

Usage

  • Make a folder with the tiles (and only the tiles) to build the image;
    • The script gen_tiles.py can help with this task; it builds tiles in multiple colors based on a source tile (note: it's recommended that the source file have an RGB color of (240,240,240)). It is used as python gen_tiles.py path/to/image and creates a folder with a 'gen_' prefix in the same path as the base image.
  • Run python tiler.py path/to/image path/to/tiles_folder/.

Configuration

All configurations can be changed in the conf.py file.

gen_tiles.py

  • DEPTH - number of divisions in each color channel (ex: DEPTH = 4 -> 4 * 4 * 4 = 64 colors);
  • ROTATIONS - list of rotations, in degrees, to apply over the original image (ex: [0, 90]).
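To illustrate what DEPTH does, a small sketch that snaps each RGB channel to DEPTH evenly spaced levels (illustrative only; gen_tiles.py's exact arithmetic may differ):

```python
# Quantize a color to DEPTH levels per channel, giving DEPTH**3 possible
# tile colors (illustrative sketch of the DEPTH setting described above).
def quantize_channel(value, depth=4):
    step = 255 / (depth - 1)
    return round(round(value / step) * step)

def quantize_color(rgb, depth=4):
    return tuple(quantize_channel(c, depth) for c in rgb)

print(quantize_color((240, 100, 30), depth=4))  # (255, 85, 0)
```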

tiler.py

  • COLOR_DEPTH - number of divisions in each color channel (ex: COLOR_DEPTH = 4 -> 4 * 4 * 4 = 64 colors);
  • RESIZING_SCALES - scale to apply to each tile (ex: [1, 0.75, 0.5, 0.25]);
  • PIXEL_SHIFT - number of pixels shifted to create each box (ex: (5,5); if None, the shift equals the tile dimensions);
  • OVERLAP_TILES - if tiles can overlap;
  • RENDER - render the image as it is being built;
  • POOL_SIZE - multiprocessing pool size;
  • IMAGE_TO_TILE - image to tile (ignored if passed as the 1st arg);
  • TILES_FOLDER - folder with tiles (ignored if passed as the 2nd arg);
  • OUT - result image filename.

Examples

Circles

Various sizes

Original cake image by pongsakornred from FLATICON.

Fixed

  • 10x10
  • 25x25
  • 50x50

Paper clips

Cross stitch (times)

Hearts

Legos

Minecraft blocks

Stripes (lines)

At signs (@)

Hironsan/BossSensor

Hide screen when boss is approaching.

BossSensor

Hide your screen when your boss is approaching.

Demo

The boss stands up. He is approaching.

standup

When he approaches, the program captures face images and classifies them.

approaching

If the image is classified as the boss, the program switches what is on the screen.

editor

Requirements

  • Web camera
  • Python 3.5
  • OS X
  • Anaconda
  • Lots of images of your boss and of other people

Put images into data/boss and data/other.

Usage

First, train on the boss images.

$ python boss_train.py

Second, start BossSensor.

$ python camera_reader.py

Install

Install OpenCV, PyQt4, Anaconda.

conda create -n venv python=3.5
source activate venv
conda install -c https://conda.anaconda.org/menpo opencv3
conda install -c conda-forge tensorflow
pip install -r requirements.txt

Change Keras backend from Theano to TensorFlow.
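Switching the backend is done in the Keras configuration file, ~/.keras/keras.json (this is the standard Keras mechanism; field names may differ slightly across Keras versions):

```json
{
    "backend": "tensorflow",
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32"
}
```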

Licence

MIT

Author

Hironsan

vipstone/faceai

An entry-level project for face, video, and text detection and recognition.

vipstone/faceai

Features

  1. Face detection and recognition (images and video)
  2. Facial contour annotation
  3. Avatar compositing (putting a hat on a person)
  4. Digital makeup (lipstick, eyebrows, eyes, etc.)
  5. Gender recognition
  6. Emotion recognition (seven emotions: angry, disgusted, fearful, happy, sad, surprised, and neutral)
  7. Video object extraction
  8. Image inpainting (usable for watermark removal)
  9. Automatic image colorization
  10. Eye tracking (work in progress)
  11. Face swapping (work in progress)

See the feature previews below ↓↓↓

Development environment

  • Windows 10(x64)
  • Python 3.6.4
  • OpenCV 3.4.1
  • Dlib 19.8.1
  • face_recognition 1.2.2
  • keras 2.1.6
  • tensorflow 1.8.0
  • Tesseract OCR 4.0.0-beta.1

Tutorials

Setting up the OpenCV environment

Text recognition with Tesseract OCR

Face detection in images (OpenCV version)

Face detection in images (Dlib version)

Face detection in video (OpenCV version)

Face detection in video (Dlib version)

Drawing facial contours

Digital makeup

Face recognition in video

Avatar effect compositing

Gender recognition

Emotion recognition

Video object extraction

Image inpainting

Other tutorials

Switching Ubuntu apt-get and pip sources

Switching pip/pip3 to a domestic mirror (Windows)

Rendering Chinese text with OpenCV

Drawing with the mouse in OpenCV

Feature previews

Drawing facial contours

68 facial landmark annotation

Avatar effect compositing

Gender recognition

Emotion recognition

Digital makeup

Face detection in video

Face recognition in video

Image inpainting

Automatic image colorization


Technical approach

Overview of the implementation:

Face recognition: OpenCV / Dlib

Face detection: face_recognition

Gender recognition: Keras + TensorFlow

Text recognition: Tesseract OCR

TODO

Face swapping (in progress)

Eye-movement direction detection (in progress)

Dlib performance optimization

Dlib model training guide

Tesseract model training guide

youyuge34/Anime-InPainting

An application tool based on edge-connect that can do anime inpainting and drawing: automatic restoration of anime images, including mosaic removal, filling, and blemish removal.

youyuge34/Anime-InPainting

Anime-InPainting: An application Tool based on Edge-Connect



Important

Update 2019.3.27:
Our newest model, PI-REC, is more powerful.
If you want the latest AI drawing technology, rather than just image inpainting, follow the link above 👆


Introduction

See the tool in action above 👆 | Bilibili video tutorial: TO DO

This is a nitro-boosted (optimized) fork of Edge-Connect, the latest research in image inpainting.
A front end written with OpenCV sits on top of the Edge-Connect back end, making it convenient to use as a tool.
The tool can do automatic image inpainting and mosaic removal; the model-training process has been optimized as well. For the specific optimizations, see Improvements in the English version.

Update: the training manual is finally finished and published! You can follow the guide to train on your own dataset.

Prerequisites

  • Python 3
  • PyTorch 1.0 (0.4 raises errors)
  • NVIDIA GPU + CUDA cuDNN (the current version can also run on CPU; set DEVICE in config.yml)

Installing third-party libraries

  • Clone this repo
  • Install PyTorch and torchvision --> http://pytorch.org
  • Install the Python requirements:

pip install -r requirements.txt

Running the Tool

Coach! I have a bold idea 🈲 ... easy now, one step at a time:

Note: the models below were trained on an anime face (avatar) dataset, so inpainting quality on full-body anime images is mediocre; to train your own model, see the training guide below.

  1. Download the pretrained model files --> Google Drive | Baidu
  2. Extract the .7z into your repository root.
    Make sure your directory now looks like this: ./model/getchu/<xxxxx.pth>
  3. Complete the prerequisites and third-party library installation steps above
  4. (Optional) Check and edit the ./model/getchu/config.yml configuration file
  5. Run one of the following commands:

Default tool:

python tool_patch.py --path model/getchu/

Tool with the edge-editing window:

python tool_patch.py --edge --path model/getchu/

Command-line argument help:

python tool_patch.py -h

P.S. You can also run any other model with this tool; download more of the original author's models at Edge-Connect.
Organize the files as above; all other commands stay the same. The only caveat: this project's config.yml has a few more options than the original's, so edit it if you hit errors.

Tool manual

For details, see the console output or the __doc__ in tool_patch.py.
Quick reference:

  • Left mouse button - Input window: paint a mask over the defect area; Edge window: draw edges by hand
  • Right mouse button - Edge window: eraser
  • Key [ - thinner brush (the brush size is printed to the console)
  • Key ] - thicker brush
  • Key 0 - Todo
  • Key 1 - Todo
  • Key n - inpaint the masked area using only the input image
  • Key e - inpaint the masked area using the input image plus the edge image (only when the Edge window is open)
  • Key r - reset everything
  • Key s - save the output image
  • Key q - quit

Training guide

Training guide --> Read

Feature Matching

  • Preprocessed the watermark template image, e.g. removing the non-text parts after binarization
  • Tried several OpenCV algorithms:
    • ORB + Brute-Force, i.e. brute-force matching, via the cv2.BFMatcher() method
    • SIFT + FLANN, i.e. fast nearest-neighbor matching, via the cv2.FlannBasedMatcher() method
    • Template Matching, via the cv2.matchTemplate() method
  • In the end, Template Matching turned out to be the simplest, most convenient, and most effective
  • If the watermark position is fixed, feature matching can be skipped entirely in favor of going straight to the Inpainting step below

Inpainting

  • Some preprocessing is needed before repairing the image
    • First obtain the watermark mask: a bitmap the same size as the target image that is black everywhere except the text portion of the watermark
    • Because of the earlier binarization of the watermark, an outline tends to remain, so the mask is dilated once to cover it
  • Used the Telea algorithm proposed in 2004, an inpainting method based on the Fast Marching Method (FMM)

Todo

  • Some watermarks are too similar to the background image; improve the accuracy of locating the watermark
  • Improve the inpainting algorithm, perhaps by trying deep learning?
  • Google's CVPR 2017 paper "On the Effectiveness of Visible Watermarks" is said to be very good; worth a read

License

MIT