Unsupported feature 'unsupported_requested_format:snowflake'. #227
Comments
Hi @solderjs Sorry for the delay. This error is thrown because the golang driver doesn't support PUT for now. I traced back through #223 and the only workaround right now is to use the external stage approach that @howryu mentioned:
We will put a higher priority on developing PUT command support for Golang.
@ChTimTsubasa Thanks so much for getting back to me. What I did for now was buffer each stream into 25 MB chunks of validated CSV, then shell out to snowsql to load each batch, and continue in a loop until the whole stream is done.

HTTPS

What would be absolutely best is if you would support a plain HTTPS GET. That would take nearly zero effort on your part. All you have to do in your code is sidestep the Azure/S3 API call and feed the HTTP URL directly to the part where you would otherwise be getting the response back from S3/Azure. Ask one of your developers; I would be highly surprised if the code took more than 4 hours from first patch to fully tested - far less work than implementing PUT from start to finish (and far more efficient, faster, and less costly, as it would allow direct streaming). To be clear: I just want to be able to give you a URL like this (with some option to specify a header as a bonus):
And then receive a request like this:
And then send the data back to you - the exact same way you already handle it with S3 and Azure. Almost no code required on your part - and far, far simpler than what you already have. Absolute cake to implement.

Azure mocking

I tried just doing S3 mocking. I couldn't do any S3 mocks because it looks like you don't support the full protocol. It seems like you have the AWS URLs hard-coded, not dynamic.
@solderjs Let me make sure I understand your requirement, and please correct me if I am wrong: it seems that you have some CSV files on HTTPS endpoints that you host, and you want to load them into Snowflake.
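For illustration, here is a minimal Go sketch of the kind of plain HTTPS endpoint being proposed (the path, port, and rows are placeholders); the point is that a plain GET returns the CSV the same way an S3 or Azure fetch does:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

func main() {
	// Serve the CSV at a plain URL; anything that can issue an HTTPS GET
	// (which is all the proposal asks of Snowflake) can fetch it.
	http.HandleFunc("/exports/data.csv", func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "text/csv")
		fmt.Fprintln(w, "id,name")
		for i := 1; i <= 3; i++ {
			// stand-in rows; a real handler would pipe the live stream here
			fmt.Fprintf(w, "%d,row-%d\n", i, i)
		}
	})
	// a real deployment would use ListenAndServeTLS to serve over HTTPS
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```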
Actually it's a CSV stream - which is why using HTTPS is far better than PUT. From your point of view it's the same as if it were a file, but for me it means I just let your server know that I have it, and then when I see your request come in a moment later, I can pipe it straight to you through the network without ever having to store it to disk at all.
Thank you so much!
Can you give a rough estimate of the data generation rate for the CSV stream?
@ChTimTsubasa It's a streamed upload, so it's bounded by network speed. Since I'm using Go, the data will be piped in a way that applies back pressure: I won't upload it faster than we receive it (obviously), and if we can't send it as fast as we receive it, the TCP window will shrink and slow the transfer. Most files are under 100 MB and transmitted infrequently (a few times a week), so it won't matter too much. Probably less than 5% of the files are over 100 MB, and probably something in the range of 0.1% will approach a gigabyte. Assuming standard HTTP deflate/gzip compression, the transferred sizes could be less than half of that.
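A minimal sketch of what "piped with back pressure" means here, assuming a hypothetical upload URL and a channel of rows:

```go
package stream

import (
	"encoding/csv"
	"io"
	"net/http"
)

// Upload writes rows into the request body as they arrive. The CSV writer
// blocks on the pipe and the pipe blocks on the network, so a slow receiver
// slows the producer instead of forcing anything to be buffered to disk.
func Upload(url string, rows <-chan []string) error {
	pr, pw := io.Pipe()

	go func() {
		w := csv.NewWriter(pw)
		for row := range rows {
			if err := w.Write(row); err != nil {
				pw.CloseWithError(err)
				return
			}
		}
		w.Flush()
		pw.CloseWithError(w.Error())
	}()

	// http.Post streams the request body from the pipe as it is produced.
	resp, err := http.Post(url, "text/csv", pr)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}
```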
@solderjs Faking the S3 response would be hard, since we piggyback on the S3 SDK on the server side to implement this, and there are tons of optimizations for COPY (to make it way faster than INSERT) that make faking it even harder. For now I would say the best way is still to upload to an S3 external stage and then use the COPY command to load from the stage into Snowflake, if you insist on using the Golang driver. Or, if you can bear with other drivers, you can use ODBC (C/C++) or JDBC (Java) as well, since those drivers support the PUT command, so you don't have to create and maintain your own S3 bucket. Snowflake is optimized for batch processing, which is the main goal of a data warehouse. We are trying to catch up on streaming ingestion, but that is still under investigation.
@ChTimTsubasa I don't need anything from your end to support streaming. From your end I just need HTTPS - the ability to have you fetch a file - the same as is done with Azure and S3. That's "streaming enough". The issue with S3 is more that I don't see anywhere in your API to specify the URL; it appears to be hard-coded to use Amazon. There are many S3 clones available, like minio, which are built to the spec of the S3 SDK, so it shouldn't be too hard to use one of them, but they require the ability to specify the domain (which you already support with Azure). If you could document what you expect from Azure, it looks like it would be trivial for me to reproduce. As far as I can tell it's just two requests - one to list the directory and one to fetch the file - but there must be some option you're expecting that I'm leaving out or not parameterizing correctly. Using S3 or Azure gives me no benefit and is just extra work (on both ends). My current workaround is to call out to the snowsql CLI every 25 MB and issue the COPY command there. Plain-old, tried and true, dumb, stupid, simple, uninteresting, boring HTTPS is all I want.
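A rough sketch of that snowsql workaround, assuming placeholder table and stage names and that connection details come from the snowsql config:

```go
package loader

import (
	"fmt"
	"os"
	"os/exec"
)

// LoadBatch writes one ~25 MB batch of validated CSV to a temp file and
// shells out to the snowsql CLI to stage and COPY it. Table and stage names
// are placeholders, not from the original thread.
func LoadBatch(batch []byte) error {
	f, err := os.CreateTemp("", "batch-*.csv")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())

	if _, err := f.Write(batch); err != nil {
		f.Close()
		return err
	}
	f.Close()

	// PUT uploads the file to the table stage, then COPY loads it.
	query := fmt.Sprintf(
		"PUT file://%s @%%my_table; COPY INTO my_table FROM @%%my_table FILE_FORMAT = (TYPE = CSV);",
		f.Name(),
	)
	cmd := exec.Command("snowsql", "-q", query)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}
```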
@ChTimTsubasa Could you expose
@solderjs, according to the COPY team, it appears it is not easy to add HTTP endpoint support. I'm forwarding the request to them anyway. Thanks for the great input!
A message from the data pipeline team.
For further discussion, please reach out to the support team to raise a ticket. Thanks.
Issue description
I'm trying to follow the PUT documentation and I'm getting the above error. I'm not clear on whether the error is claiming that the Go client doesn't support CSV or if there's something else amiss.
https://docs.snowflake.net/manuals/sql-reference/sql/put.html
Example code
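A minimal sketch of the kind of call that hits this error (the account, credentials, file path, and stage below are placeholders, not the reporter's original code):

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/snowflakedb/gosnowflake"
)

func main() {
	// DSN format: user:password@account/database/schema (placeholder values)
	db, err := sql.Open("snowflake", "user:password@my_account/my_db/public")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Attempt to upload a local CSV to the user stage with PUT,
	// per the documentation linked above.
	if _, err := db.Exec("PUT file:///tmp/data.csv @~"); err != nil {
		// fails with the "Unsupported feature" error in the issue title
		log.Fatal(err)
	}
}
```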
Error log
Configuration
Driver version (or git SHA):
Go version: run `go version` in your console
Client OS: Amazon Ubuntu Linux