D
D
denislysenko2021-10-20 12:54:23
MySQL
denislysenko, 2021-10-20 12:54:23

How to fix SQL script?

Hello. There are two tables in the database: movies (for 10,000 rows) and rating (for 100,000 rows)

the movies table has the following structure:

movie_id, title, year, genres
'1', 'Toy Story ', '1995', 'Adventure| Animation|Children|Comedy|Fantasy'
'2', 'Jumanji ', '1995', 'Adventure|Children|Fantasy'
'3', 'Grumpier Old Men ', '1995', 'Comedy|Romance'
'4', 'Waiting to Exhale ', '1995', 'Comedy|Drama|Romance'
...

That is, in the movies table there is a movie_id that only occurs once because it belongs to a particular movie, the movie's title, the year the movie was released,


'3', '6', '4.00'
'5', '50', '5.00'
'6', '3', '5.00'
...
here are the sql queries to create these tables:

CREATE TABLE movies (
movie_id        INT,
title           VARCHAR(256),
year            INT,
genres          TEXT,
PRIMARY KEY     (movie_id)
);

CREATE TABLE rating (
id              INT                 AUTO_INCREMENT,
movie_id        INT(20)             NOT NULL,
rating          DECIMAL(3, 2),
PRIMARY KEY     (id)
);


That is, in the rating table, ratings (reviews) about these films are collected, id is the review identifier and it is unique, movie_id is the id of the film that was rated (rating), rating is the rating that was given to this film. Each film has several ratings.

I need to output sorted movies to the console based on command line arguments
in the format:
Genre, title, year, svg(rating)

here is my code
import argparse
from config import host, user, passwd, db_name, default_genres
import pymysql
from pymysql.constants import CLIENT


def argpars():
    parser = argparse.ArgumentParser()
    parser.add_argument('-N',
                        type=int,
                        help='the number of the highest rated films for each genre',
                        default=1000
                        )
    parser.add_argument('-genres',
                        type=str,
                        help='filter by genre',
                        default ='Action|Adventure|Animation|Children|Comedy|Crime|Documentary|Drama|Fantasy|Film-Noir|Horror|IMAX|Musical|Mystery|Romance|Sci-Fi|Thriller|War|Western'
                        )
    parser.add_argument('-year_from',
                        type=int,
                        help='filter by year (FROM YEAR)',
                        default=1800
                        )
    parser.add_argument('-year_to',
                        type=int,
                        help='filter by year (TO YEAR)',
                        default=2025
                        )
    parser.add_argument('-regexp',
                        type=str,
                        help='filter on the movie name',
                        default=''
                        )
    return parser.parse_args()


def get_connection_to_db(host, user, passwd, db_name):
    connection = None
    try:
        connection = pymysql.connect(
            host = host,
            user = user,
            passwd = passwd,
            database = db_name,
            client_flag = CLIENT.MULTI_STATEMENTS
        )
        print('Connection to MySQl_db successful')
    except Error as e:
        print(f"The error '{e}' occurred")

    return connection


def print_result(connection, sql_script_path, args):
        genres = args.genres.split("|")
        for gener in genres:
            with connection.cursor() as cursor:
                with open(sql_script_path) as file:
                    script = file.read()
                    cursor.execute(script.format(year_from=args.year_from,
                                                 year_to=args.year_to,
                                                 name=connection.escape(args.regexp),
                                                 gener=connection.escape(gener),
                                                 rating=1,
                                                 N=args.N))
                    rows = cursor.fetchall()
                    for row in rows:
                        print('{};{};{};{}'.format(gener, row[0], row[1], row[2]))



def main():
    args = argpars()
    connection = get_connection_to_db(host, user, passwd, db_name)
    print_result(connection, './files/sql/GET_RESULT_MOVIES.sql', args)



main()


this is my sql query, where is the error in it? That is, I need this sql query to return the title of the movie, the year of the movie, and the average rating for this movie

SELECT m.title, year, avg(r.rating)
FROM movies AS m
JOIN rating AS r
ON m.movie_id = r.movie_id
WHERE m.year BETWEEN {year_from} and {year_to} AND m.title LIKE "%{name}%" AND m.genres LIKE "%{gener}%"
GROUP BY  r.movie_id, m.title, m.year
HAVING avg(r.rating) > {rating}
ORDER BY avg(r.rating) DESC, m.year DESC, m.title asc
LIMIT {N};


when I run the program, it doesn't output anything to the console.

Answer the question

In order to leave comments, you need to log in

1 answer(s)
G
galaxy, 2021-10-20
@denislysenko

There is an idea that the quotes interfere. This is what gener=connection.escape(gener)you get in the request LIKE "%'blabla'%".
connection.escape is generally not recommended for use :

def escape(self, obj, mapping=None):
        """Escape whatever value is passed.
        Non-standard, for internal use; do not use this in your applications.
        """

Form a string in python. Only if you don't want users to be able to use the LIKE special characters in the search string, they need to be escaped:
cursor.execute(script.format(year_from=args.year_from,
             year_to=args.year_to,
             name="%{}%".format( (args.regexp.translate(str.maketrans({'%': '\%', '_': '\_'})),) ),
             gener="%{}%".format( (gener.translate(str.maketrans({'%': '\%', '_': '\_'})),) ),
             rating=1,
             N=args.N))

SELECT m.title, year, avg(r.rating)
FROM movies AS m
JOIN rating AS r
ON m.movie_id = r.movie_id
WHERE m.year BETWEEN {year_from} and {year_to} AND m.title LIKE {name} AND m.genres LIKE {gener}
GROUP BY  r.movie_id, m.title, m.year
HAVING avg(r.rating) > {rating}
ORDER BY avg(r.rating) DESC, m.year DESC, m.title asc
LIMIT {N};

something like this

Didn't find what you were looking for?

Ask your question

Ask a Question

731 491 924 answers to any question