My current task requires me to export an estimated 100,000 rows of data from a very large database.
I'm fairly new to handing big data and I would love to hear a few best practices and guidelines from people who have had prior experiences with these issues that worked for them in the past in an effort to make this post non-subjective.
A few more details:
The database is not at all normalized (very ugly)
I have to deal with at least 100,000 rows in total
The task is run at midnight with less users
Currently using ColdFusion 9, PostgreSQL 8.4
This is what my code looks like after applying Craig's solution:
<cfset base_path = GetDirectoryFromPath(ExpandPath("*.*")) & "some_parent\some_child\">
<cfif not DirectoryExists(base_path)> <cfdirectory directory="#base_path#" action="create" mode="777">
<cfset this_batch_path = DateFormat(Now(), 'mmddyyyy') & TimeFormat(Now(), 'hhmmss') & "\">
<cfdirectory directory="#base_path##this_batch_path#" action="create" mode="777">
<cfset this_filename = "someprefix_" & DateFormat(Now(), 'yyyymmdd') & ".csv">
<cffile action="write" file="#base_path##this_batch_path##this_filename#" output="">
<cfset escaped_copy_path = ListChangeDelims(base_path & this_batch_path & this_filename, "\\", "\")>
<cfquery name="qMyQuery" datasource="some_db" username="some_uname" password="some_pword" result="something"> COPY some_table TO '#escaped_copy_path#' WITH CSV HEADER;
Now I need to get the count of copied rows.
In the PGSQL 8.4 docs:
On successful completion, a COPY command returns a command tag of the form
The count is the number of rows copied.
But I can't seem to get it to work, even with the result tag and the query itself.