There are a few aspects to answering this question.
First, as stated in the comments, pthread_attr_setstacksize is The Right Way to do this. If the library calling pthread_create doesn't have a way to let you do this, fixing the library would be the ideal solution. If the thread is purely internal to the library (not calling code from the calling application) it really should set its own preference for the stack size based on something like PTHREAD_STACK_MIN + ITS_OWN_NEEDS. If it's calling back to your code, it should let you request how much stack space you need.
Second, as an implementation detail, glibc uses the stack limit from setrlimit/ulimit to derive the stack size for threads created by pthread_create. You can perhaps influence the size this way, but it's not portable, and as you've found, not reliable even there (it's not working when you call setrlimit from within the process itself). It's possible that glibc only probes the limit once when the relevant code is first initialized, so I would try moving the setrlimit call as early as possible in main to see if this helps.
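To check whether the setrlimit call has any effect, something like the following sketch could help. It assumes glibc behavior, where pthread_attr_getstacksize on a freshly initialized attribute object reports the default that pthread_create will use; other C libraries may report something different. The function names are hypothetical:

```c
#include <pthread.h>
#include <stddef.h>
#include <sys/resource.h>

/* Lower only the soft RLIMIT_STACK; the hard limit is left untouched.
   Returns 0 on success, -1 on failure (errno set by getrlimit/setrlimit). */
int lower_stack_soft_limit(rlim_t bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    rl.rlim_cur = bytes;
    return setrlimit(RLIMIT_STACK, &rl);
}

/* Report the stack size a default-attribute thread would get.
   Assumption: on glibc an untouched attr reports the library default.
   Returns 0 if the size could not be determined. */
size_t default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;
    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}
```

Call default_thread_stack_size() at the very top of main, then lower_stack_soft_limit(1024 * 1024), then default_thread_stack_size() again, and print both values. If they are identical, the limit was probably sampled before your call, and setting it in the parent (for example with ulimit -s in the shell that launches the program) is the thing to try next.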
Finally, the stack size for threads may not even be relevant to your application. Even if the stack size is 8MB, only the pages which have actually been modified (probably 4k or at most 8k unless you have big arrays on the stack) are actually using physical memory. The rest is just tying up virtual address space (of which you typically have at least 2-3 GB, even on a 32-bit system) and possibly commit charge. By default, Linux enables overcommit, so commit charge will not be strictly enforced, and therefore the fact that glibc is requesting too much may not even matter. You could make the overcommit checking even less strict by writing a 1 to /proc/sys/vm/overcommit_memory, but this will cause you to lose information about when you're "running out of memory" and make your program crash instead. On such a constrained system you may prefer even stricter overcommit accounting, but then you have to fix the thread stack size problem...
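If you want to see which overcommit policy is in effect before deciding, a small sketch like this reads the setting. The function name read_overcommit_mode is hypothetical; the file is Linux-specific, where 0 means heuristic overcommit (the default), 1 means always overcommit, and 2 means strict accounting:

```c
#include <stdio.h>

/* Read the Linux overcommit policy.
   Returns 0, 1, or 2 as stored in the file, or -1 if it cannot be read
   (for example on a non-Linux system). */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    int mode = -1;

    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}
```

With mode 2 the unused part of each 8MB stack still counts against the commit limit, which is the case where shrinking the thread stacks really matters.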
[...] setrlimit call and what was the return value?
[...] pthread_attr_setstacksize().
[...] pthread_attr_setstacksize() because the pthread is created in the library I'm using. @mafso, setrlimit returned 0. BTW, I set rlim_cur and rlim_max to 1MB, but it's as if the created thread is still using a stack size of 8MB.
[...] main()