Answer the question

In order to leave comments, you need to log in

B

B

BadCats2021-01-08 21:01:17

2D Discrete Fourier Transform for Image (very big question - lots of text)?

**Essence of the issue:**

There is a binarized black and white image obtained after several stages of processing the original image using filtering methods in the spatial domain - a unified employee profile - black printed text on a white background and a black and white photograph in the corner of the document. Also, in some places, the background of the document has artifacts - "spots" of gray color - areas of pixels on some part of the image. Artifacts of gray color can be present both on a pure background and serve as a background-substrate in the text area - that is, in its main part - the text on a white background, but some of the letters, words can be on a gray "spot". It is required to filter the image in the frequency domain.

**Background and some clarifications (please don't skip):**

the first stage of frequency filtering will be the transition to the frequency domain using the DFT (I know that for adequate calculations you need to use the FFT, but I will figure it out step by step). I read about the DFT, in fact, as well as about image processing in the book "The World of Digital Image Processing" - S. Gonzalez, R. Woods. Let me just make two points:

- Before that, I didn't deal with image processing.
- Not familiar with DSP

- The library outputs only the Fourier spectrum in the form of an image, without the ability to use the "raw data" of the transformation from the spatial to the frequency domain
- The library is a "black" box, which makes it impossible to experiment in order to "twist and see what's going on" (I, as a beginner in all this, really need it - no one says to write bicycles all the time, but to understand a new area of \u200b\u200bwork - necessary)

In order to deal with the two-dimensional case - I started with something simpler - with a one-dimensional DFT for a wave (sonic or something else), I found two such examples:

```
/////////ДПФ
double Re = 0, Im = 0, summaRe = 0, summaIm = 0, Ak[512] = {0}, Ak_1[512] = {0}, Arg = 0;
for (int i = 0; i < 512; i++){
summaRe = 0; summaIm = 0;
for (int j = 0; j < 512; j++){
Arg = 2.0*M_PI*j*i/512.0;
Re = cos(Arg)*(massiv[j]);
Im = sin(Arg)*(massiv[j]);
summaRe = summaRe + Re;
summaIm = summaIm + Im;}
Ak[i] = sqrt(summaRe*summaRe + summaIm*summaIm);}
```

512 samples are set according to the condition of the original question - I'll replace it, let's say with the width of the image (if we imagine the image as a single strip (vector) of pixels). As I understand it, according to this formula from the book:

for the one-dimensional case of the DFT, there are two cycles - one for counting the points of the transformation itself and one for the sum of the values for each point.

Also, in principle, I understand where the sines and cosines come from (yes, I know that the DFT is, in fact, the expansion of the signal into harmonics, but I’m more about mathematical notation):

Another found example, in C #:

```
public class FurieTrans {
public List<Complex> koeffs;
public double K;
public FurieTrans() {
koeffs = new List<Complex>(); }
public void dpf(List<Point> points, int count) {
koeffs.Clear();
K = points.Count;
//Цикл вычисления коэффициентов
for(int u=0; u<count; u++)
{ //цикл суммы
Complex summa = new Complex();
for (int k = 0; k < K; k++) {
Complex S = new Complex(points[k].X, points[k].Y);
double koeff = -2 * Math.PI * u * k / K;
Complex e = new Complex(Math.Cos(koeff), Math.Sin(koeff));
summa += (S * e); }
koeffs.Add(summa/K); } }
public List<Complex> undpf() {
List<Complex> res = new List<Complex>();
for(int k=0; k<K-1; k++) {
Complex summa = new Complex();
for (int u = 0; u < koeffs.Count; u++ ) {
double koeff = 2 * Math.PI * u * k / K;
Complex e = new Complex(Math.Cos(koeff), Math.Sin(koeff));
summa+=(koeffs[u]*e); }
res.Add(summa); }
return res; } }
```

Now, based on all this and the formula from the book for the 2D DFT event:

- here's what I got:

```
static void MakeFurieTransform(FT_mode mode) {
Bitmap FFT_bitmap = new Bitmap(workBitmap.Width, workBitmap.Height);
double realPart = 0, imagePart = 0, rotAngle = 0, amplitude = 0;
double e = 0;
double summ = 0;
if (mode == FT_mode.direct) {
for (int u = 0; u < workBitmap.Width; u++) {
for (int v = 0; v < workBitmap.Height; v++) {
summ = 0;
//циклы суммы
for (int x = 0; x < workBitmap.Width; x++) {
for (int y = 0; y < workBitmap.Height; y++) {
double Rot_coefficient = -2 * Math.PI * (u * x / workBitmap.Width + v * y / workBitmap.Height);
realPart = Math.Cos(Rot_coefficient);
imagePart = Math.Sin(Rot_coefficient);
e = realPart + imagePart;
summ += workBitmap.GetPixel(x, y).ToArgb() * e; } }
//формируем Фурье-спектр
amplitude = Math.Sqrt(realPart * realPart + imagePart * imagePart);
rotAngle = Math.Atan(imagePart/realPart);
int_R = (int)amplitude;
NormalizePixel(ref int_R);
FFT_bitmap.SetPixel(u, v, Color.FromArgb(int_R, int_R, int_R)); } } }}
```

- Where I get the value of e -
`e = realPart + imagePart;`

- what to do with the complex part of i near the sine? Those. e^iAngle=cos(Angle)+ i sin(Angle) - what to do with the imaginary part during implementation? - Here:

- I don't quite understand what should serve as the new pixel color value. Am I doing the right thing?`amplitude = Math.Sqrt(realPart * realPart + imagePart * imagePart); rotAngle = Math.Atan(imagePart/realPart); int_R = (int)amplitude; NormalizePixel(ref int_R); FFT_bitmap.SetPixel(u, v, Color.FromArgb(int_R, int_R, int_R));`

Used sources:

Code example #1 - https://www.cyberforum.ru/cpp-builder/thread860204.html

Code example #2 - https://www.cyberforum.ru/digital-signal-processin...

https://habr.com/ru/company/otus/blog/449996/

psi-logic.narod.ru/fft/fft.htm (not all of the table of contents - only the first sections with theory)

1. Examples of source images/received spectrum? - no, unfortunately I can’t post it (I can’t post the spectrum - because it can be used to restore the original image)

2. Why "that's all"? - 1. I study like everyone else. Therefore, I run into a haystack with a rake. 2.Fmnalnaya goal "that's all" - the preparation of the original image (rather specific) for OCR.

3.Why c# and not, for example, C++? - I know a little C#, so I chose it; it's easier to read / write

4. Why work with images, partly DSP, and stick to this OCR, and not, for example, neural networks or *substitute your own version*? I don't know, honestly, I don't know. I have not worked with images or neurons before, therefore, I decided to go somehow like this "classic" way of processing images using filters.

5. Maybe all this is not necessary and focus on different OCR libraries? - maybe, but I tried to feed the original image as input for both Tesseract OCR and IronOCR - the result, frankly, is sad.

Answer the question

In order to leave comments, you need to log in

2 answer(s)

@Griboks

If you only want the frequency spectrum, you can just throw out the other part, but then the inverse conversion is not possible.

You have to get the frequency-phase spectrum, cut out unwanted frequencies, convert the spectrum back into an image.

In general, I advise you to read articles about the JPEG device and the discrete cosine transform on Habré. There were several articles that went into detail about all this and even experimented with other transformations / parameters / sampling tables / resampling ...

@U235U235

G

Griboks, 2021-01-08 @Griboks

Very interesting, nothing is clear.

Where I get the value of e - e = realPart + imagePart; - what to do with the complex part i near the sine? Those. e^iAngle=cos(Angle)+ i sin(Angle) - what to do with the imaginary part during implementation?

If you only want the frequency spectrum, you can just throw out the other part, but then the inverse conversion is not possible.

I don't quite understand what should serve as the new pixel color value. Am I doing the right thing?

You have to get the frequency-phase spectrum, cut out unwanted frequencies, convert the spectrum back into an image.

In general, I advise you to read articles about the JPEG device and the discrete cosine transform on Habré. There were several articles that went into detail about all this and even experimented with other transformations / parameters / sampling tables / resampling ...

U

U235U235, 2021-02-01 @U235U235

Why did you decide that you need filtering in the frequency domain? It's not at all obvious from your description. What do you want to get as a result?

If you want to experiment with pictures, then there is Imagemagick. This will eliminate the need to write code. https://legacy.imagemagick.org/Usage/fourier/

There are also examples of filtering in the frequency domain, such as typographic raster removal.

PS Gonzalez, Woods is a good book. :-)

Similar questions

K

D

M

A

O

Didn't find what you were looking for?

Ask your questionAsk a Question

731 491 924 answers to any question