How to copy from a CSV file to a PostgreSQL table with headers in the CSV file?

Posted by 1dkrff03 5 months ago in PostgreSQL


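I want to copy a CSV file into a Postgres table. The table has about 100 columns, so I do not want to rewrite them if I don't have to.
I am using the \copy table from 'table.csv' delimiter ',' csv; command, but without a table already created I get ERROR: relation "table" does not exist. If I add a blank table I get no error, but nothing happens. I tried this command two or three times and there was no output or message, and the table had not been updated when I checked it through PGAdmin.
Is there a way to import a table with headers included, like I am trying to do?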

csv

Source: https://stackoverflow.com/questions/17662631/how-to-copy-from-csv-file-to-postgresql-table-with-headers-in-csv-file

7 Answers


30byixjq1#

This worked. The first row of the file contains the column names.

COPY wheat FROM 'wheat_crop_data.csv' DELIMITER ';' CSV HEADER


e5njpo682#

With the Python library pandas, you can easily create the column names and infer the data types from a csv file.

from sqlalchemy import create_engine
import pandas as pd

engine = create_engine('postgresql://user:pass@localhost/db_name')
df = pd.read_csv('/path/to/csv_file')
df.to_sql('pandas_db', engine)

The if_exists parameter can be set to replace or append to an existing table, e.g. df.to_sql('pandas_db', engine, if_exists='replace'). This works for other input file types as well; see the pandas documentation.


s8vozzvw3#

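Alternative from the terminal, with no special permissions

The PostgreSQL documentation for COPY says, under NOTES:
The path will be interpreted relative to the working directory of the server process (normally the cluster's data directory), not the client's working directory.
So, in general, you will run into problems with psql or any other client, even against a local server. And if you write a COPY command for other users, e.g. in a GitHub README, your readers will run into problems too.
The only way to express a relative path with the client's permissions is to use STDIN:
When STDIN or STDOUT is specified, data is transmitted via the connection between the client and the server.
For example: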

psql -h remotehost -d remote_mydb -U myuser -c \
   "copy mytable (column1, column2) from STDIN with delimiter as ','" \
   < ./relative_path/file.csv
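
The same STDIN redirection can be scripted. The following is only a sketch, assuming psql is on the PATH and that authentication is already handled (for example through a .pgpass file). The helper names `build_psql_copy_argv` and `copy_csv_via_stdin` are made up for illustration, and the table name is assumed to be trusted input. Unlike the command above, this variant uses the `HEADER` option so that the first line of the file is skipped.

```python
import subprocess
from pathlib import Path


def build_psql_copy_argv(table, host, dbname, user, delimiter=","):
    """Return the psql argv for a COPY ... FROM STDIN of a CSV file with a header row."""
    # The table name is interpolated as-is, so it must come from a trusted source.
    copy_sql = (
        f"COPY {table} FROM STDIN "
        f"WITH (FORMAT csv, HEADER, DELIMITER '{delimiter}')"
    )
    return ["psql", "-h", host, "-d", dbname, "-U", user, "-c", copy_sql]


def copy_csv_via_stdin(csv_path, **kwargs):
    """Stream a client-side CSV file to the server through psql's stdin."""
    argv = build_psql_copy_argv(**kwargs)
    with Path(csv_path).open("rb") as f:
        # The file is opened by this client process, so a relative path
        # resolves against the client's working directory, not the server's.
        subprocess.run(argv, stdin=f, check=True)
```

A call such as `copy_csv_via_stdin("./relative_path/file.csv", table="mytable", host="remotehost", dbname="remote_mydb", user="myuser")` would then behave like the shell redirection shown above.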


r9f1avp54#

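I have been using this function for a while with no problems. You only need to provide the number of columns in the csv file; it takes the header names from the first row and creates the table for you: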

create or replace function data.load_csv_file
    (
        target_table  text, -- name of the table that will be created
        csv_file_path text,
        col_count     integer
    )

    returns void

as $$

declare
    iter      integer; -- dummy integer to iterate columns with
    col       text; -- to keep column names in each iteration
    col_first text; -- first column name, e.g., top left corner on a csv file or spreadsheet

begin
    set schema 'data';

    create table temp_table ();

    -- add just enough number of columns
    for iter in 1..col_count
    loop
        execute format ('alter table temp_table add column col_%s text;', iter);
    end loop;

    -- copy the data from csv file
    execute format ('copy temp_table from %L with delimiter '','' quote ''"'' csv ', csv_file_path);

    iter := 1;
    col_first := (select col_1
                  from temp_table
                  limit 1);

    -- update the column names based on the first row which has the column names
    for col in execute format ('select unnest(string_to_array(trim(temp_table::text, ''()''), '','')) from temp_table where col_1 = %L', col_first)
    loop
        execute format ('alter table temp_table rename column col_%s to %s', iter, col);
        iter := iter + 1;
    end loop;

    -- delete the columns row // using quote_ident or %I does not work here!?
    execute format ('delete from temp_table where %s = %L', col_first, col_first);

    -- change the temp table name to the name given as parameter, if not blank
    if length (target_table) > 0 then
        execute format ('alter table temp_table rename to %I', target_table);
    end if;
end;

$$ language plpgsql;
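
The same idea, creating the table from the header row, can also be sketched on the client side. This is not the function above, only an illustrative alternative: the helper names `quote_ident` and `create_table_sql` are hypothetical, and every column is declared as text, as in the PL/pgSQL version. The generated statement could be run once before a `\copy ... CSV HEADER`.

```python
import csv
import io


def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(table, csv_file, delimiter=","):
    """Build a CREATE TABLE statement with one text column per CSV header field."""
    # Only the first row (the header) is read from the file object.
    header = next(csv.reader(csv_file, delimiter=delimiter))
    columns = ", ".join(f"{quote_ident(col.strip())} text" for col in header)
    return f"CREATE TABLE {quote_ident(table)} ({columns});"


# In-memory sample standing in for open("table.csv", newline="").
sample = io.StringIO("id,crop name,yield\n1,wheat,3.2\n")
print(create_table_sql("wheat", sample))
# prints: CREATE TABLE "wheat" ("id" text, "crop name" text, "yield" text);
```

Because the identifiers are quoted, the column names keep the exact case and spacing of the header row.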


sczxawaw5#

## csv with header
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
  -c "\COPY TB_NAME FROM 'data_sample.csv' WITH (FORMAT CSV, header);"

## csv without header
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
  -c "\COPY TB_NAME FROM 'data_sample.csv' WITH (FORMAT CSV);"

## csv without header, specify column
$ psql -U$db_user -h$db_host -p$db_port -d DB_NAME \
  -c "\COPY TB_NAME(COL1,COL2) FROM 'data_sample.csv' WITH (FORMAT CSV);"

All columns in the csv must match the columns of the table (or the columns you specify).
About COPY:
https://www.postgresql.org/docs/9.2/sql-copy.html
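
When loading into a table, the `HEADER` option only skips the first line; in the PostgreSQL version documented above it does not map header names to columns. One way to cope with a CSV whose column order differs from the table is to build the column list from the header row. The sketch below assumes the header fields are exactly the table's column names (quoted identifiers are case-sensitive); `copy_command_from_header` is a hypothetical helper, not part of psql.

```python
import csv
import io


def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def copy_command_from_header(table, csv_file, csv_name):
    """Build a psql \\copy command whose column list follows the CSV header order."""
    header = next(csv.reader(csv_file))
    columns = ", ".join(quote_ident(col.strip()) for col in header)
    return f"\\copy {table} ({columns}) FROM '{csv_name}' WITH (FORMAT csv, HEADER)"


# In-memory sample standing in for open("data_sample.csv", newline="").
sample = io.StringIO("col2,col1\nb,a\n")
print(copy_command_from_header("tb_name", sample, "data_sample.csv"))
# prints: \copy tb_name ("col2", "col1") FROM 'data_sample.csv' WITH (FORMAT csv, HEADER)
```

The printed line can be pasted into an interactive psql session or saved to a script file.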


myss37ts6#

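To copy from a CSV file with headers into a PostgreSQL table using a query:

First, put all the files in the C:/temp folder.
Then write the script below, which accepts both null values and empty strings: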

copy PUBLIC."TABLE_NAME" FROM 
   'C:\tmp\TABLE_NAME.CSV' 
   (format csv, null "NULL", DELIMITER ',', HEADER);


eaf3rand7#

You can use d6tstack, which creates the table for you and is faster than pd.to_sql() because it uses native DB import commands. It supports Postgres as well as MySQL and MS SQL.

import pandas as pd
import d6tstack.utils
df = pd.read_csv('table.csv')
uri_psql = 'postgresql+psycopg2://usr:pwd@localhost/db'
d6tstack.utils.pd_to_psql(df, uri_psql, 'table')

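It is also useful for importing multiple CSVs, resolving data schema changes, and/or preprocessing with pandas (e.g. for dates) before writing to the database; see the examples notebook for more.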

import glob
import d6tstack.combine_csv
d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
    apply_after_read=apply_fun).to_psql_combine(uri_psql, 'table')  # apply_fun: your own preprocessing function
